Foreword


The International Colloquium on Automata, Languages and Programming
(ICALP) is the annual conference of the European Association for Theoretical
Computer Science (EATCS). The conference aims at enabling computer scientists
to exchange theoretical ideas and results, as well as at stimulating cooperation
between the theoretical and the practical community in computer science.

The main topics of ICALP ’97 included computability, automata, formal
languages, new computing paradigms, term rewriting, analysis and design of
algorithms, computational geometry, computational complexity, symbolic and
algebraic computation, cryptography and security, data types and data structures,
theory of databases and knowledge bases, semantics of programming languages,
program specification and verification, foundations of logic programming,
parallel and distributed computation, theory of concurrency, theory of robotics,
and theory of logical design and layout.

ICALP ’97 was held in Bologna, Italy, July 7-11, 1997. Previous colloquia
took place in Paderborn (1996), Szeged (1995), Jerusalem (1994), Lund (1993),
Wien (1992), Madrid (1991), Warwick (1990), Stresa (1989), Tampere (1988),
Karlsruhe (1987), Rennes (1986), Nafplion (1985), Antwerpen (1984), Barcelona
(1983), Aarhus (1982), Haifa (1981), Amsterdam (1980), Graz (1979), Udine
(1978), Turku (1977), Edinburgh (1976), Saarbrücken (1974), and Paris (1972).
The next ICALP will be held in Aalborg, Denmark, July 13-17, 1998.

ICALP ’97 came in conjunction with the 25th anniversary of EATCS. The
celebration of the association and of its founders included a historical perspective
on the achievements of the community in the last 25 years, with a talk by M. Nivat,
the first EATCS President, and a discussion on the new challenges that EATCS
will face in the future.

ICALP ’97 was organised differently than before and accommodated further
events, to react positively to the new challenges that the theoretical science
community faces in the information technology society. Indeed, our community has
developed and now utilizes several approaches and different methodologies that
require increased specialization. As a consequence, there is a growing number
of specialized conferences and workshops, and it is difficult for researchers to
follow the recent developments on specialized research topics. ICALP ’97 was
a first step towards having a conference offering a single unifying environment
while leaving room for specialization. In such an event, the computer science
community interested in the development of formal methods and methodologies
can stress the relationships that exist among different branches. The new
organization of ICALP ’97 can be summarized as follows.

Invited  talks  There  were  more  invited  presentations  than  usual.  The  eight 
talks  presented  the  main  developments  occurring  in  a  specific  area  and  the 
promising  new  trends. 

Plenary and parallel sessions  Some papers were presented in plenary
sessions. Parallel sessions were organised for the other submitted papers,
according to the two tracks of the Journal of Theoretical Computer Science;
this reflects the main division in research topics within the community, while
making evident its unifying aspects.

Satellite workshops  Seven satellite workshops were held immediately before
or after the main conference. Their specific topics were often at the interface
between theoretical computer science and other information technology
research areas.

Policy of research funding  A panel discussion was held, with panelists
including experts responsible for governmental and industrial research and
development agencies in Europe and the U.S.

The Program Committee selected 73 papers out of 197 submissions, 183 of
which were in electronic format. Their authors are from 30 countries all over
the world. Each submission was sent to four Program Committee members,
assisted by their own referees.

The selection meeting took place in Bologna, March 15-16, 1997. To permit
a deeper evaluation of the papers, the Program Committee split into two parts for
a preliminary discussion, according to the division mentioned above. Then, all
the papers were evaluated again and all the decisions were taken together.

We  would  like  to  warmly  thank  all  the  Program  Committee  members  and 
their  referees  for  their  invaluable  contribution. 

We are deeply indebted to all the members of the Organizing Committee
for all their time and efforts. A special “grazie” to Vladimiro Sassone for his
excellent automatic system that supported us through all the preparation of the
colloquium, from receiving submissions and referees’ reports to the preparation
of the selection meeting and of the proceedings. “Grazie” also to Chiara Bodei
for her precious help.

Finally, we gratefully acknowledge support from the UE - DG III, UNESCO
Venice Office, Italian National Council of Research (Comitati 01, 07, 12),
GNIM-CNR, IEI-CNR, the Universities of Bologna, Pisa, and Roma “La Sapienza”, the
Regione Emilia-Romagna, TELECOM Italia, and the United States Air Force
European Office of Aerospace Research and Development.

April  1997 


Pierpaolo  Degano,  Roberto  Gorrieri,  Alberto  Marchetti-Spaccamela 
Graphical Calculi for Interaction


Robin  Milner 

University  of  Cambridge,  UK 

Recently there has been great interest in operational models of interactive
systems, and more recently especially in those which capture to some extent the
elusive notion of mobility. The π-calculus [1] is one such model, and has had
some success both in application and in prompting research in abstract models
of interaction. But it can hardly claim to be canonical, and indeed nor can any
of the other operational models.

We might consider that the quest for a canonical model of interaction is no
more likely to succeed than that for a canonical model of computation. (In the
latter case, we have to be content with many models - Turing machines, register
machines, ... - and with translating between them.) Nonetheless, it would be
timid not to seek aspects which are common to many, or even most, models of
interactive behaviour.

In around 1992 I started from the π-calculus and tried to separate what
seemed ad hoc from what seemed more essential. The exact communication
discipline of the π-calculus fell into the ad hoc category; the rest - naming,
restriction, parallel composition - have greater claim to be universal. This was
the origin of action calculi [2]. To present the π-calculus as an action calculus,
one starts from the common basis of action calculi and merely adds two or three
so-called “controls” - for message-passing and replication. It turns out that the
λ-calculus, the object calculus of Abadi and Cardelli, and many recent calculi
can be similarly set up - and combined with each other - in the action-calculus
framework. Considerable progress has been made, for example in [3], in the
uniform treatment of models of action calculi.

In  the  conference  lecture  I  shall  emphasize  one  feature  of  action  calculi:  their 
graphical  presentation.  Several  examples  will  be  given  -  including  some  recent 
advances  in  calculi  for  representing  locality  -  showing  that  this  graphical  element 
is  exactly  what  all  action  calculi  have  in  common.  These  examples  motivate 
further  development  (which  is  certainly  needed)  in  the  general  theory  of  action 
calculi  and  their  models. 
NP-Completeness: A Retrospective


Christos  H.  Papadimitriou* 
University  of  California,  Berkeley,  USA 


Abstract.  For  a  quarter  of  a  century  now,  NP-completeness  has  been 
computer  science’s  favorite  paradigm,  fad,  punching  bag,  buzzword,  alibi, 
and  intellectual  export.  This  paper  is  a  fragmentary  commentary  on  its 
origins,  its  nature,  its  impact,  and  on  the  attributes  that  have  made  it 
so  pervasive  and  contagious. 


1. A keyword search in Melvyl, the University of California’s on-line library,
reveals that about 6,000 papers each year have the term “NP-complete” in
their title, abstract, or list of keywords. This is more than each of the terms
“compiler,” “database,” “expert,” “neural network,” and “operating system.”
Even more surprising is the diversity of the disciplines with papers referring to
“NP-completeness”: they range from statistics and artificial life to automatic
control and nuclear engineering. What is the nature and extent of the impact of
NP-completeness on theoretical computer science, computer science in general,
computing practice, as well as other domains of the natural sciences, applied
science, and mathematics? And why did NP-completeness become such a pervasive
and influential concept?

2. One of the reasons for the immense impact of NP-completeness has to be
the appeal and elegance of the class P, that is, of the thesis that “polynomial
worst-case time” is a plausible and productive mathematical surrogate of the
empirical concept of “practically solvable computational problem.” But,
obviously, NP-completeness also draws on the importance of NP, as it rests on the
widely conjectured contradistinction between these two classes. In this regard,
it is crucial that NP captures vast domains of computational, scientific, and
mathematical endeavor, and seems to roughly delimit what mathematicians and
scientists had been aspiring to compute feasibly. True, there are domains, such
as strategic analysis and counting, which have been within our computational
ambitions, and still seem to lie outside NP; but they are the exceptions rather
than the rule. NP-completeness has thus become a valuable intermediary
between the abstraction of computational models and the reality of computational
problems, grounding complexity theory to computational practice.

3. Also crucial for the success of NP-completeness has been its surprising
ubiquity and effectiveness as a classification tool, and the scarcity of problems in
NP that resist classification as either polynomial-time solvable or NP-complete.
(Ladner’s result on intermediate degrees between P and NP-completeness [12]
had been known almost as soon as NP-completeness was introduced, and thus
theoretically the world could be full of mysterious intermediate problems.) On
several occasions, extremely broad classes of computational problems in NP have
been dichotomized with surprising accuracy into polynomially solvable and
NP-complete; see [21, 22] for two early examples.

* christos@cs.berkeley.edu. Partially supported by the National Science Foundation.
A version of this talk was given at a meeting in the Fall of 1995 celebrating the 60th
birthday of Richard M. Karp, to whom this paper is also affectionately dedicated.

4. The founders of NP-completeness [2, 10, 13] appear to have anticipated its
broad applicability and classification power. Leonid Levin [13] wrote in 1973:
“The method described here clearly provides a means for readily obtaining
results of [this type] for the majority of important sequential search problems.” In
Karp’s paper [10] twenty-one problems were proved NP-complete, showing
beyond any doubt the surprisingly broad applicability of the method. Significantly,
Karp seems annoyed and surprised that three other problems (linear
programming, primality, and graph isomorphism) resisted at the time such classification.
Primality and graph isomorphism were also mentioned by Cook [2]. Knuth was
sufficiently convinced about the importance and broad applicability of the new
concept to take early and deliberate action on the terminological front [11].

5. NP-completeness has had tremendous impact even in areas where, in some
sense, it should not have. It is now common knowledge among computer
scientists that NP-completeness is largely irrelevant to public-key cryptography,
since in that area one needs sophisticated cryptographic assumptions that go
beyond NP-completeness and worst-case polynomial-time computation [19];
furthermore, cryptographic protocols based on NP-complete problems have been
ill-fated. Fortunately, the founders of modern cryptography did not know this.
Diffie and Hellman base their famous pronouncement “We stand today on the
brink of a revolution in cryptography” [3] on two facts: (1) Very fast hardware
and software, and (2) novel techniques for proving problems hard (they cite
Karp’s paper [10]).

6. NP-completeness has also exhibited a great amount of versatility, adapting
to contexts and computational aspects beyond its original scope of worst-case
analysis of exact algorithms for decision and optimization problems. For
example, it was used early on to show that certain optimization problems cannot be
approximated satisfactorily [20], and indeed in a most ingenious and
comprehensive way more recently [1]. By showing that even less ambitious goals than
worst-case polynomial exact solution are unattainable, NP-completeness is thus
a most useful tool for repeatedly pruning unpromising research directions and
thus redirecting research to new ones (in a manner reminiscent of the struggle
between Hercules and the monster Hydra [16]).

7. Let me illustrate this versatility of NP-completeness by a technical interlude
on an aspect of efficient computation that has interested me recently, namely,
output polynomial time. Certain computational problems require an output $f(x)$
on input $x$ that is in the worst case exponential in the input. For such problems,
one would like to have algorithms that are polynomial in $|x|$ and $|f(x)|$. The class
of problems thus solvable can be called output polynomial time. One can use
NP-completeness to prove that certain functions are not in output-polynomial time,
unless P=NP. For example, consider the function MIN which maps a regular
expression to the minimum-state equivalent deterministic finite-state automaton.
MIN can be computed by first designing a nondeterministic automaton $M$, then
an equivalent deterministic automaton $M'$, and next minimizing the states of
$M'$ to obtain the final output; the problem is, of course, that the intermediate
result $M'$ could be exponential in both the input and the output. It is rather
straightforward to use “traditional” NP-completeness techniques to show the
following:

Theorem  1.  Unless  P=NP,  MIN  is  not  in  output  polynomial  time. 

In fact, we cannot even compute in output-polynomial time a deterministic
automaton that has at most polynomially more states than the minimum, unless,
of course, P=NP.

8. Often the required output $f(x)$ is a set $\{y_1, \ldots, y_k\}$ of strings that are related
to $x$ via an NP mapping; for example, if $G$ is a graph, let AMIS($G$) be the set of
all maximal independent sets of $G$. AMIS is known to be in output-polynomial
time (see [9] for an exposition and strengthening of this result, and an early
discussion of output polynomial time). For such problems we have an elegant
alternative definition of output polynomial time. A function $f : \Sigma^* \to 2^{\Sigma^*}$ is
in output polynomial time if the following problem is solvable in polynomial
time: Given $x$ and $Y \subseteq \Sigma^*$, either decide that $Y = f(x)$, or find a string in
$Y \oplus f(x)$. It is easy to see that, if such an algorithm exists, then its iteration
starting with $Y = \emptyset$ gives an output polynomial time algorithm for $f$; and
vice-versa, if an output polynomial time algorithm exists for $f$, it can be used to
produce an element of $Y \oplus f(x)$. For example, AMIS is in output polynomial
time; its generalization to hypergraphs is open, but was recently shown to be
in output  time [6]; see [5] for an extensive discussion of the hypergraph
generalization of AMIS. One can use again “traditional” NP-completeness to
show that the following generalization is not in output polynomial time, unless
P=NP: Given a monotone circuit, compute the set of all minimal (with respect
to the set of true inputs) satisfying truth assignments.
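
The iteration argument in paragraph 8 can be sketched in code. This is only an illustration under stated assumptions: the `Oracle` type, `enumerate_output`, and `make_toy_oracle` are hypothetical names, the input x is assumed to be captured inside the oracle, and the toy oracle stores f(x) explicitly, whereas a real one would have to find its witness in polynomial time. C++17 is assumed.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <set>
#include <string>

using StringSet = std::set<std::string>;

// Hypothetical oracle type: given the current candidate set Y, it returns
// std::nullopt when Y == f(x), and otherwise some string in the symmetric
// difference of Y and f(x). The input x is assumed to be captured inside.
using Oracle = std::function<std::optional<std::string>(const StringSet&)>;

// Iterate the oracle starting from the empty set. Y stays a subset of f(x)
// throughout, so every witness is an element of f(x) still missing from Y.
StringSet enumerate_output(const Oracle& oracle) {
    StringSet Y;
    while (auto w = oracle(Y)) {
        assert(Y.count(*w) == 0);  // invariant: the witness is new
        Y.insert(*w);
    }
    return Y;
}

// Toy oracle for illustration only: f(x) is a fixed, explicitly stored set.
Oracle make_toy_oracle(StringSet target) {
    return [target](const StringSet& Y) -> std::optional<std::string> {
        for (const auto& s : target)
            if (Y.count(s) == 0) return s;       // in f(x) but not in Y
        for (const auto& s : Y)
            if (target.count(s) == 0) return s;  // in Y but not in f(x)
        return std::nullopt;                     // Y == f(x)
    };
}
```

The loop calls the oracle |f(x)| + 1 times, so a polynomial-time oracle yields a running time polynomial in the input and output sizes, which is the claim made in the text.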

9. But, sometimes, “traditional” NP-completeness techniques do not seem to
suffice to bring out the intractability of a problem, because this problem belongs to
a class or computational mode that appears to be “between” P and NP. In such
cases NP-completeness has acted as an open-ended research paradigm,
spawning variants that are appropriate for the computational context being studied;
examples are classes that capture local search [8], the parity argument [14],
logarithmic nondeterminism [18], the related concept of fixed-parameter tractability
[4], and approximability [17].

10. Complexity classes introduced this way, as abstractions of natural
computational problems of mysteriously intermediate complexity, are in some precise
sense well-motivated, indeed necessary; they are discovered, not invented, as they
have always existed by dint of their natural complete problems. The only way to
make them go away is to collapse them with P or NP, as occasionally happens;
recall [17] and its brilliant follow-up [1].

11. NP-completeness is of course a valuable tool for demonstrating the difficulty
of computational problems. However, NP-completeness is often used
“allegorically;” a problem is shown NP-complete that is not, strictly speaking, a natural
computational problem, but an artificial problem created to capture a
mathematical concept. NP-completeness in this context suggests that a problem, area,
or approach is mathematically nasty. Because, if we believe that efficient
algorithms are the natural outflow of the mathematical structure of a problem (a view
shared by all computer scientists, with the possible exception of researchers in
“metaphor-based” algorithmic paradigms such as neural nets, in which
algorithmic behavior is thought to be “emergent”), then, contrapositively, complexity
must be the manifestation of mathematical poverty, lack of structure. See [7] for
an early example of such a use of NP-completeness in the theory of relational
databases.

12.  Beyond  mathematics,  NP-completeness  (and  complexity  in  general)  can  also 
be  applied  “allegorically”  in  other  disciplines.  It  can  be  used  as  a  metaphor 
for  chaos  in  dynamical  systems,  for  unbounded  rationality  in  game  theory,  for 
unfairness  in  economics,  for  integrity  of  electoral  systems  in  political  science, 
for  cognitive  implausibility  in  artificial  intelligence,  for  genetic  indeterminism  in 
genetics,  and  so  on  (see  [16]  for  references). 

13. NP-completeness is thus an important “intellectual export” of computer
science to other disciplines. And it does fill a void in the interdisciplinary
intellectual trade: It seems to me that the concept of lower bounds (and negative
results in general) is particular to computer science, and has no well-developed
counterpart in other disciplines. True, one sees isolated results in other sciences
(such as Heisenberg’s uncertainty principle in quantum mechanics, Arrow’s
impossibility theorem in economics, and Carnot’s theorem in thermodynamics)
which are arguably negative; however, nowhere else in science does one find such
a comprehensive methodology for obtaining negative results (with the exception
of complexity’s own precursor mathematical logic, with its many
incompleteness, undecidability, and inexpressibility results). NP-completeness is therefore
valuable for another reason: It is one of the few precious features which give our
science its special character, which set it apart from the other sciences (see [15]
for another development of this argument).

14. In science, successful ideas are those that are pervasive and invasive, are
invitingly elegant and methodical, are open to extensions and variants, and
capture an objective necessity, answer a widespread but diffuse sense of
dissatisfaction in the scientific community (in the case of NP-completeness, the widespread
feeling among computer scientists in the 1960s that automata theory, the
previous great paradigm, had run its course as a useful abstraction of computation).
Thinking about the nature and history of NP-completeness could give us useful
hints about computer science’s next great paradigm, which, for all I know, has
started being articulated somewhere else in this volume.
Abstract. We give an overview of the LEDA platform for combinatorial
and  geometric  computing  and  an  account  of  its  development.  We  discuss 
our  motivation  for  building  LEDA  and  to  what  extent  we  have  reached 
our  goals.  We  also  discuss  some  recent  theoretical  developments.  This 
paper  contains  no  new  technical  material.  It  is  intended  as  a  guide  to 
existing  publications  about  the  system.  We  refer  the  reader  also  to  our 
web-pages  for  more  information. 


1  What  is  LEDA? 

LEDA [MN95, MNU96] aims at being a comprehensive software platform for
combinatorial and geometric computing. It provides a sizable collection of data
types and algorithms. This collection includes most of the data types and
algorithms described in the text books of the area ([AHU83, Meh84, Tar83, CLR90,
O’R94, Woo93, Sed91, Kin90, van88, NH93]). In particular, it includes stacks,
queues, lists, sets, dictionaries, ordered sequences, partitions, priority queues,
directed, undirected, and planar graphs, lines, points, planes, and polygons, and
many algorithms in graph and network theory and computational geometry,
e.g., shortest paths, matchings, maximum flow, min cost flow, planarity testing,
spanning trees, biconnected and strongly connected components, segment
intersection, convex hulls, Delaunay triangulations, and Voronoi diagrams. LEDA
supports applications in a broad range of areas. It has already been used in
such diverse areas as code optimization, VLSI design, graph drawing, graphics,
robot motion planning, traffic scheduling, machine learning and computational
biology.

We  discuss  different  aspects  of  the  LEDA  system. 

Ease of Use: The library is easy to use. In fact, only a small fraction of our users
are algorithms experts and many of our users are not even computer scientists.
For these users the broad scope of the library, its ease of use, and the correctness
and efficiency of the algorithms in the library are crucial.
The LEDA manual [MNU96] gives precise and readable specifications for
the data types and algorithms mentioned above. The specifications are short
(typically not more than a page), general (so as to allow several implementations)
and abstract (so as to hide all details of the implementation).

* Max-Planck-Institut für Informatik, Im Stadtwald, 66123 Saarbrücken,
www.mpi-sb.mpg.de/~mehlhorn

** Martin-Luther-Universität Halle-Wittenberg, FB Mathematik und Informatik,
Weinbergweg 17, 060099 Halle, www.informatik.uni-halle.de/~naeher
LEDA Software GmbH 66123 Saarbrücken, www.mpi-sb.mpg.de/LEDA/leda.html

Extendibility: Combinatorial and geometric computing is a diverse area and
hence it is impossible for a library to provide ready-made solutions for all
application problems. For this reason it is important that LEDA is easily extendible
(see also section 4.4) and can be used as a platform for further software
development. In many cases LEDA programs are very close to the typical text book
presentation of the underlying algorithms. The goal is the equation

Algorithm + LEDA = Program.

We give an example. Dijkstra’s shortest path algorithm takes a directed graph
$G = (V, E)$, a node $s \in V$, called the source, and a non-negative cost function on
the edges $cost : E \to \mathbb{R}_{\ge 0}$. It computes for each node $v \in V$ the distance from
$s$. A typical text book presentation of the algorithm is as follows.


    set dist(s) to 0.
    set dist(v) to infinity for v different from s.
    declare all nodes unreached.
    while there is an unreached node
    {  let u be an unreached node with minimal dist-value.    (*)
       declare u reached.
       forall edges e = (u,v) out of u
          set dist(v) = min( dist(v), dist(u) + cost(e) )
    }


The text book presentation will then continue to discuss the implementation of
line (*). It will state that the pairs $\{(v, dist(v)) : v \text{ unreached}\}$ should be stored
in a priority queue, e.g., a Fibonacci heap, because this will allow the selection
of an unreached node with minimal distance value in logarithmic time. It will
probably refer to some other chapter of the book for a discussion of priority
queues.

We  now  give  the  corresponding  LEDA  program;  it  is  very  similar  to  the 
presentation  above. 

    #include <LEDA/graph.h>
    #include <LEDA/node_pq.h>

    void DIJKSTRA(const graph& G, node s, const edge_array<double>& cost,
                  node_array<double>& dist)
    { node_pq<double> PQ(G);   // priority queue over the nodes of G
      node v;
      edge e;

      // initialize dist and insert every node into PQ
      forall_nodes(v,G)
      { if (v == s) dist[v] = 0; else dist[v] = MAXDOUBLE;
        PQ.insert(v,dist[v]);
      }

      while ( !PQ.empty() )
      { node u = PQ.del_min();   // unreached node with minimal dist-value
        forall_adj_edges(e,u)
        { v = target(e);
          double c = dist[u] + cost[e];
          if ( c < dist[v] )
          { PQ.decrease_inf(v,c); dist[v] = c; }
        }
      }
    }

We start by including the graph and the node priority queue data type. We use
edge_arrays and node_arrays (arrays indexed by edges and nodes respectively)
for the functions cost and dist. We declare a priority queue PQ for the nodes of
graph G. It stores pairs (v, dist[v]) and is empty initially. The forall_nodes-loop
initializes dist and PQ. In the main loop we repeatedly select a pair (u, dist[u])
with minimal distance value and then scan through all adjacent edges to update
distance values of neighboring vertices.
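
For readers without LEDA at hand, the same algorithm can be sketched with the C++ standard library alone. This is a minimal sketch, not the LEDA implementation. The adjacency-list `Graph` type and the function name `dijkstra` are assumptions made for this illustration. Since `std::priority_queue` offers no decrease-key operation, the sketch substitutes lazy deletion (outdated queue entries are skipped) for the `decrease_inf` call. C++17 is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical graph representation: adj[u] lists (target, cost) pairs
// for the edges leaving node u.
using Graph = std::vector<std::vector<std::pair<std::size_t, double>>>;

std::vector<double> dijkstra(const Graph& adj, std::size_t s) {
    assert(s < adj.size());
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<double> dist(adj.size(), inf);
    dist[s] = 0.0;

    // Min-heap of (distance, node) pairs.
    using Entry = std::pair<double, std::size_t>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;
    pq.push({0.0, s});

    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d > dist[u]) continue;    // outdated entry, u was already settled
        for (const auto& [v, cost] : adj[u]) {
            assert(cost >= 0.0);      // the algorithm needs non-negative costs
            double c = dist[u] + cost;
            if (c < dist[v]) {        // shorter path to v found
                dist[v] = c;
                pq.push({c, v});
            }
        }
    }
    return dist;
}
```

Unreachable nodes keep the distance infinity here, where the LEDA program uses MAXDOUBLE.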

Correctness: We try to make sure that the programs in LEDA are correct.
We  start  from  correct  algorithms,  we  document  our  implementations  carefully 
(at  least  recently),  we  test  them  extensively,  and  we  have  developed  program 
checkers  (see  subsection  4.1)  for  some  of  them.  We  want  to  emphasize  that 
many  of  the  algorithms  in  LEDA  are  quite  intricate  and  therefore  non-trivial 
to  implement.  In  the  combinatorial  domain  it  is  frequently  possible  to  obtain 
a  correct  implementation  by  sacrificing  efficiency,  e.g.,  by  using  linear  search  in 
the  realization  of  a  dictionary.  In  the  geometric  domain  it  is  usually  difficult  to 
obtain  a  correct  implementation  even  if  efficiency  plays  no  role.  This  is  due  to  the 
so-called  degeneracy  and  precision  problem  [MN94].  The  geometric  algorithms  in 
LEDA  use  exact  arithmetic  and  are  therefore  free  from  failures  due  to  rounding 
errors.  Moreover,  they  can  handle  all  degenerate  cases. 
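
To illustrate why exact arithmetic removes rounding failures, here is a small sketch of an orientation test evaluated entirely in integers. It is not LEDA code, and LEDA's own number types are not used: the `Point` struct, the function names, and the coordinate bound of 2^30 are assumptions chosen so that 64-bit integers suffice.

```cpp
#include <cassert>
#include <cstdint>

struct Point { std::int64_t x, y; };

// Assumed coordinate bound for this sketch: |x|, |y| < 2^30, so every
// product below stays within the range of a signed 64-bit integer.
constexpr std::int64_t kBound = std::int64_t{1} << 30;

bool in_range(Point p) {
    return -kBound < p.x && p.x < kBound && -kBound < p.y && p.y < kBound;
}

// +1: a, b, c make a left (counter-clockwise) turn; -1: right turn;
// 0: the three points are collinear (the degenerate case).
int orientation(Point a, Point b, Point c) {
    assert(in_range(a) && in_range(b) && in_range(c));
    std::int64_t det = (b.x - a.x) * (c.y - a.y) - (b.y - a.y) * (c.x - a.x);
    return (det > 0) - (det < 0);
}
```

Because the determinant is computed without rounding, the collinear case is reported exactly and can be handled explicitly by the calling algorithm.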

Efficiency: LEDA contains the most efficient realizations known for its types.
For many data types the user may even choose between different
implementations, e.g., for dictionaries he may choose between ab-trees, BB[α]-trees, dynamic
perfect hashing, and skip lists. The declarations

    dictionary<string,int> D1;
    dictionary<string,int,skip_list> D2;

declare D1 as a dictionary from string to int with the default implementation
and select the skip list implementation for D2.
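
The idea of selecting an implementation through an extra template argument can be sketched with standard containers. The names `Dictionary`, `TreeImpl`, and `HashImpl` are hypothetical and are not part of LEDA; the sketch only shows how one interface can sit on top of interchangeable implementations.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <unordered_map>

// Hypothetical implementation tags (not LEDA names): each maps a pair of
// key and value types to a concrete standard container.
struct TreeImpl {
    template <class K, class V> using Container = std::map<K, V>;
};
struct HashImpl {
    template <class K, class V> using Container = std::unordered_map<K, V>;
};

// One interface; the implementation is chosen by an optional third
// template argument and defaults to the balanced-tree variant.
template <class K, class V, class Impl = TreeImpl>
class Dictionary {
public:
    void insert(const K& k, const V& v) { data_[k] = v; }
    bool defined(const K& k) const { return data_.count(k) != 0; }
    const V& access(const K& k) const {
        auto it = data_.find(k);
        assert(it != data_.end());  // the key must be defined
        return it->second;
    }
private:
    typename Impl::template Container<K, V> data_;
};
```

With these definitions, `Dictionary<std::string, int> D1;` uses the default and `Dictionary<std::string, int, HashImpl> D2;` selects the hash-based variant, mirroring the two declarations in the text.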
Availability and Usage: LEDA is realized in C++ and runs on many different
platforms (Unix, Windows95, Windows NT, OS/2) with many different
compilers.

LEDA is now used at more than 1500 academic sites. Academic use is free, see
http://www.mpi-sb.mpg.de/LEDA/leda.html. A commercial version of LEDA
is marketed by LEDA Software GmbH. There are license holders in the
telecommunication industry (ATR (Japan), Comptel (Finland), E-Plus (Germany), France
Telecom (France), MCI (USA)), in the graphics industry (Aristo Technologies
(USA), Cadabra (Canada), Compass Design (USA), Fuji (Japan), Mentor
Graphics (USA), MUS (Germany)), in the automotive industry (Daimler
Benz (Germany), Ford (USA), Honda (Japan)), in the computer industry (DEC
(USA), IBM (USA), Siemens AG (Germany), Silicon Graphics (USA), SUN
(USA)), and other industries (Chevron (USA), CFP (Germany), Dolphin (The
Netherlands), Howmedica (Germany), Lufthansa (Germany), Neovista (USA),
Prediction (USA), Sony (Japan), VTT (Finland)).

History:  We  started  the  project  in  the  fall  of  1988.  We  spent  the  first  6  months 
on  specifications  and  on  selecting  our  implementation  language.  Our  test  cases 
were  priority  queues,  dictionaries,  partitions,  and  algorithms  for  shortest  paths 
and minimum spanning trees. We came up with the item concept as an abstraction
of the notion “pointer into a data structure”. It worked successfully for the
three data types mentioned above and we are now using it for most data types
in LEDA. Concurrently with searching for the correct specifications we
investigated several languages for their suitability as our implementation platform.
We  looked  at  Smalltalk,  Modula,  Ada,  Eiffel,  and  C++.  We  wanted  a  language 
that  supported  abstract  data  types  and  type  parameters  (polymorphism)  and 
that  was  widely  available.  We  wrote  sample  programs  in  each  language.  Based 
on  our  experiences  we  selected  C++  because  of  its  flexibility,  expressive  power, 
and  availability.  We  are  even  more  convinced  now  that  our  choice  was  the  right 
one. 

A first publication about LEDA appeared in MFCS 1989 (Lecture Notes in
Computer Science, Volume 379) and ICALP 1990 (Lecture Notes in Computer
Science, Volume 443). Stefan Näher became the head of the LEDA project and
he is the main designer and implementer of LEDA.

In the second half of 1989 and during 1990 Stefan Näher implemented a
first version of the combinatorial part (= data structures and graph algorithms)
of LEDA (Version 1.0). Version 2.0 allowed the use of arbitrary data types (not
only pointer and simple types) as actual type parameters of parameterized data
types. It included a first implementation of the two-dimensional geometry library
(libP) and an interface to the X-Window system for graphical input and output
(data type window). Version 3.0 switched to the template mechanism to realize
parameterized data types (macro substitution was used before), introduced
implementation parameters that allow choosing between different implementations,
extended the LEDA memory management system to user-defined classes,
and further improved the efficiency of many data types and algorithms. Version
3.1 provided a more efficient graph data type and contained new data types
(arbitrary precision number types and basic geometric objects) used for robust
implementations of geometric algorithms. Versions 3.2 and 3.3 contained
more geometry and new tools for documentation and manual production.

LEDA  Software  GmbH  was  founded  in  early  1995. 

2  Why  did  we  build  LEDA? 

We  had  four  main  reasons: 

1.  We  had  always  felt  that  a  significant  fraction  of  the  research  done  in  the 
algorithms  area  was  eminently  practical.  However,  only  a  small  part  of  it 
was  actually  used.  We  frequently  heard  from  our  former  students  that  the 
effort  needed  to  implement  an  advanced  data  structure  or  algorithm  is  too 
large  to  be  cost-effective.  We  concluded  that  algorithms  research  must  include 
implementation  if  the  field  wants  to  have  maximum  impact. 

2.  Even  within  our  own  research  group  we  found  different  implementations  of 
the  same  balanced  tree  data  structure.  Thus  there  was  constant  reinvention 
of  the  wheel  even  within  our  own  tight  group. 

3. Many of our students had implemented algorithms for their master’s thesis.
Work invested by these students was usually lost after the students graduated.
We had no depository for implementations.

4.  The  specifications  of  advanced  data  types  which  we  gave  in  class  and  which 
we  found  in  text  books,  including  the  one  written  by  one  of  the  authors,  were 
incomplete and not sufficiently abstract. They contained phrases of the form:
“Given a pointer to a node in the heap its key can be decreased in constant
amortized time”. This implied that a user of a data structure had to have
knowledge of its implementation. As a consequence combining implementations
was a non-trivial task. A case in point is the shortest path problem in
graphs.  We  taught  priority  queues  in  the  early  weeks  of  an  algorithm  course 
and  Dijkstra’s  algorithm  for  the  shortest  path  problem  in  later  weeks.  Our 
students  found  it  difficult  to  combine  the  programs. 

The  goal  of  the  LEDA  project  is  to  overcome  these  shortcomings  by  creating  a 
platform  for  combinatorial  and  geometric  computing.  The  LEDA  library  should 
contain  the  major  findings  of  the  algorithms  community  in  a  form  that  makes 
them  directly  accessible  to  non-experts  having  only  a  limited  knowledge  in  the 
area.  In  this  way  we  hoped  to  reduce  the  gap  between  research  and  application. 

3  Did  we  achieve  our  goals? 

We  believe  that  we  have  reached  the  last  goal  and  have  at  least  partially  reached 
the  first  three  goals. 

LEDA  was  first  distributed  in  the  summer  of  1990.  Its  user  community  has 
grown  ever  since.  LEDA  is  now  used  at  more  than  1500  academic  and  industrial 
sites  in  over  50  different  countries  world-wide.  Industrial  use  started  in  1994. 
Many users of LEDA are outside computer science and only a small fraction of
our  users  are  from  the  algorithms  community.  We  therefore  believe  that  we  have 
reached  our  first  two  goals.  The  impact  of  algorithms  research  has  increased  and 
there  is  considerable  use  of  LEDA  and  hence  reuse  of  implementations.  However, 
the  gap  between  algorithms  research  and  algorithms  use  is  still  quite  large.  In 
particular, many of the non-expert users of LEDA complain that a tutorial is
missing. We hope that the forthcoming LEDA book [MN] will help.

We have also partially achieved our third goal. We now do have a depository
for our students’ work and we have just introduced the concept of LEDA
extension packages (LEPs) that will allow a wider community to contribute. We come
back to LEPs in section 4.4.

We  have  achieved  our  last  goal.  The  specifications  of  our  data  types  are 
sufficiently  abstract  and  precise  so  as  to  allow  their  combination  without  any 
knowledge  of  implementation.  We  have  seen  an  example  in  section  1.  Many  of 
our  specifications  are  based  on  the  so-called  item  concept  which  gives  an  abstract 
treatment  of  pointers  into  a  data  structure.  Different  components  of  LEDA  can 
be  combined  without  knowledge  of  the  implementation. 
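
The way the item concept lets a priority queue and Dijkstra's algorithm be combined can be sketched as follows. This is an illustration in standard C++ under our own assumptions, not LEDA's interface: the class `p_queue`, its member names, and the use of `std::multiset` in place of a heap are hypothetical, and edge costs are assumed to be non-negative. Unlike a LEDA item, the handle here is replaced on each `decrease_key` rather than staying fixed.

```cpp
#include <cassert>
#include <limits>
#include <set>
#include <utility>
#include <vector>

// A priority queue whose insert() returns an "item": an opaque handle that
// later identifies the entry, whatever data structure is used internally.
class p_queue {
    using entry = std::pair<long, int>;           // (priority, information)
    std::multiset<entry> heap_;                    // stand-in for a real heap
public:
    using item = std::multiset<entry>::iterator;   // clients treat it as opaque
    item insert(long prio, int inf) { return heap_.insert(entry(prio, inf)); }
    // Lower the priority of an entry; the item returned replaces the old one.
    item decrease_key(item it, long prio) {
        int inf = it->second;
        heap_.erase(it);
        return heap_.insert(entry(prio, inf));
    }
    bool empty() const { return heap_.empty(); }
    int del_min() {                                // returns the information
        int inf = heap_.begin()->second;
        heap_.erase(heap_.begin());
        return inf;
    }
};

struct edge { int to; long cost; };

// Dijkstra's algorithm written only against the p_queue interface above.
// Assumes non-negative edge costs.
std::vector<long> dijkstra(const std::vector<std::vector<edge>>& g, int s) {
    const long INF = std::numeric_limits<long>::max();
    std::vector<long> dist(g.size(), INF);
    std::vector<p_queue::item> item_of(g.size());
    std::vector<bool> in_queue(g.size(), false);
    p_queue pq;
    dist[s] = 0;
    item_of[s] = pq.insert(0, s);
    in_queue[s] = true;
    while (!pq.empty()) {
        int u = pq.del_min();
        in_queue[u] = false;
        for (const edge& e : g[u]) {
            long d = dist[u] + e.cost;
            if (d >= dist[e.to]) continue;         // no improvement
            dist[e.to] = d;
            if (in_queue[e.to]) {
                item_of[e.to] = pq.decrease_key(item_of[e.to], d);
            } else {
                item_of[e.to] = pq.insert(d, e.to);
                in_queue[e.to] = true;
            }
        }
    }
    return dist;
}
```

The function `dijkstra` never inspects how `p_queue` stores its entries; it only keeps the items it was handed. Swapping the multiset for a Fibonacci heap would leave the algorithm's code unchanged.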

The project also had a number of positive side-effects which we did not
foresee. Firstly, LEDA’s wide use gives us tremendous satisfaction¹. Secondly, our
experiences with the system suggested many difficult and well motivated
problems for theoretical algorithms research. We will discuss program checking,
running time prediction, and theoretical issues in the implementation of geometric
algorithms below. The system has changed the way we do algorithms research.

4  Recent  developments 

A  strength  of  the  LEDA  project  is  its  strong  theoretical  underpinning.  We  believe 
that  only  our  strong  theoretical  background  allowed  us  to  build  LEDA.  In  the 
last  two  years  we  paid  particular  attention  to  program  checking,  running  time 
prediction,  and  the  correct  implementation  of  geometric  programs. 


4.1  Program  checking 

Programming  is  a  notoriously  error-prone  task;  this  is  even  true  when  program¬ 
ming  is  interpreted  in  a  narrow  sense:  going  from  a  (correct)  algorithm  to  a 
program.  The  standard  way  to  guard  against  coding  errors  is  program  testing. 
The  program  is  exercised  on  inputs  for  which  the  output  is  known  by  other 
means,  typically  as  the  output  of  an  alternative  program  for  the  same  task. 
Program  testing  has  severe  limitations: 

-  It  is  usually  only  done  during  the  testing  phase  of  a  program.  Also,  it  is 
difficult  to  determine  the  “correct”  suite  of  test  inputs. 

¹ We stated above that algorithms research must include implementation to have
maximal impact. We might add: without implementation algorithm research is less
rewarding.
- Even if appropriate test inputs are known it is usually difficult to determine
the correct outputs for these inputs: alternative programs may have different
input and output conventions or may be too inefficient to solve the test cases.

Given that program verification, i.e., formal proof of correctness of an
implementation, will not be available on a practical scale for some years to come,
program checking has been proposed as an extension to testing [BK89, BLR90].
The cited papers explored program checking in the area of algebraic, numerical,
and combinatorial computing. In [MNS+96, MM95, HMN96] we discuss program
checkers for planarity testing and a variety of geometric tasks. We have
also added program checkers to some of the LEDA programs, e.g., the planarity
test provides a planar drawing for a planar graph and a Kuratowski subgraph
for a non-planar graph. A user of the planarity algorithm thus has the possibility
to verify that the output of the algorithm is correct.
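
A planarity checker is too long to show here, but the same pattern (the program returns a witness for either answer, and a much simpler routine checks the witness) can be sketched on an easier problem. The example below is our own illustration, not LEDA code: it checks the output of a hypothetical bipartiteness test, which is assumed to return a 2-colouring for a bipartite graph and an odd closed walk for a non-bipartite one. Vertices are assumed to be numbered 0 to n-1.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

using edge_list = std::vector<std::pair<int, int>>;

// Witness for "bipartite": a colouring with colours 0 and 1.
// Accept iff every vertex has a legal colour and no edge is monochromatic.
bool check_two_colouring(int n, const edge_list& edges,
                         const std::vector<int>& colour) {
    if (static_cast<int>(colour.size()) != n) return false;
    for (int c : colour)
        if (c != 0 && c != 1) return false;
    for (const auto& e : edges)
        if (colour[e.first] == colour[e.second]) return false;
    return true;
}

// Witness for "not bipartite": a closed walk of odd length.
// Accept iff the length is odd and consecutive vertices are joined by an edge.
bool check_odd_cycle(const edge_list& edges, const std::vector<int>& cycle) {
    if (cycle.size() % 2 == 0) return false;      // also rejects the empty walk
    std::set<std::pair<int, int>> present;
    for (const auto& e : edges) {
        present.insert(std::make_pair(e.first, e.second));
        present.insert(std::make_pair(e.second, e.first));
    }
    for (std::size_t i = 0; i < cycle.size(); ++i) {
        int u = cycle[i];
        int v = cycle[(i + 1) % cycle.size()];
        if (present.count(std::make_pair(u, v)) == 0) return false;
    }
    return true;
}
```

Each checker is a few lines long and runs in time roughly linear in the input (up to the set look-ups), so its own correctness is far easier to establish than that of the algorithm it checks.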


4.2  Running  Time  Prediction 

Big-O analysis of algorithms is concerned with the asymptotic analysis of
algorithms, i.e., with the behavior of algorithms for large inputs. It does not allow
the prediction of actual running times of real programs on real machines and
therefore its predictive value is limited.

- An algorithm with running time $O(n)$ is faster than an algorithm with
running time $O(n^2)$ for sufficiently large $n$. Is $n = 10^6$ large enough? Asymptotic
analysis of algorithms is of little help to answer this question. It is however
true that a well-trained algorithms person who knows program and analysis
can make a fairly good guess.

-  For  a  user  of  LEDA  statements  of  asymptotic  running  times  are  almost 
meaningless  as  he/she  has  no  way  to  estimate  the  constants  involved.  After 
all,  the  purpose  of  LEDA  is  to  hide  the  implementations  from  our  users. 

The  two  items  above  clearly  indicate  that  we  need  more  than  asymptotic 
analysis  in  order  to  have  a  theory  with  predictive  value.  The  ultimate  goal  of 
analysis of algorithms must be a theory that allows one to predict the actual running
time  of  an  actual  program  on  an  actual  machine  with  reasonable  precision  (say 
within  a  factor  of  two).  We  must  aim  for  the  following  scenario:  When  a  program 
is  installed  on  a  particular  machine  a  certain  number  of  well-chosen  tests  are 
executed  in  order  to  learn  about  machine  parameters  relevant  for  the  execution 
of  the  program.  This  knowledge  about  the  machine  is  combined  with  the  analysis 
of  the  algorithm  to  predict  running  time  on  specific  inputs.  In  the  context  of  an 
algorithms  library  one  could  even  hope  to  replace  statements  about  asymptotic 
execution  times  by  statements  about  actual  execution  times  during  installation  of 
the  library.  In  [FM97]  we  show  for  a  small  number  of  programs  (Fibonacci  heaps, 
Dijkstra’s  shortest  path  algorithm,  and  a  maximum  weight  matching  algorithm) 
that running time prediction within a factor of less than two is feasible on a
wide range of machines.
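
The calibration scenario can be sketched as a two-step computation: fit a machine-dependent constant from a few timed runs, then use it to predict the running time on a new input size. The sketch below is not the method of [FM97]; the single-constant cost model $T(n) \approx c \cdot n \log_2 n$ and all names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// One timed test run: input size and measured running time.
struct sample { double n; double seconds; };

// Assumed cost model, up to a machine constant c: T(n) ~ c * n * log2(n).
double model(double n) { return n * std::log2(n); }

// Least-squares fit (through the origin) of c in seconds ~ c * model(n).
// Precondition: `runs` is non-empty and every n is greater than 1.
double fit_constant(const std::vector<sample>& runs) {
    double num = 0.0, den = 0.0;
    for (const sample& r : runs) {
        num += r.seconds * model(r.n);
        den += model(r.n) * model(r.n);
    }
    return num / den;
}

// Predicted running time for input size n on the calibrated machine.
double predict_seconds(double c, double n) { return c * model(n); }
```

At installation time one would time the program on a few well-chosen inputs, pass the measurements to `fit_constant`, and store the result. A realistic model would need several constants (for example for cache misses and comparisons) rather than one.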
4.3  Implementation  of  geometric  algorithms

Geometric algorithms are frequently formulated under two unrealistic
assumptions: computers are assumed to use exact real arithmetic (in the sense of
mathematics) and inputs are assumed to be in general position. The naive
use of floating point arithmetic as an approximation to exact real arithmetic
very rarely leads to correct implementations. In a sequence of papers
[BMS94a, See94, MN94, BMS94b, FGK+96, BRMS97] we investigated the
degeneracy and precision issues and extended LEDA based on our theoretical work.
LEDA now provides exact geometric kernels for two-dimensional and higher
dimensional computational geometry [MMN+97] and also correct implementations
for basic geometric tasks, e.g., two-dimensional convex hulls, Delaunay
diagrams, Voronoi diagrams, point location, line segment intersection, and
higher-dimensional convex hulls and Delaunay diagrams.
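
The basic primitive behind many of these tasks is the orientation test for three points. A minimal sketch of an exact version, assuming integer coordinates of absolute value at most $2^{30}$ so that all intermediate products fit into 64 bits, is given below. LEDA's kernels use arbitrary precision number types instead; the bounded-coordinate restriction and the names are ours.

```cpp
#include <cassert>
#include <cstdint>

struct point { std::int64_t x, y; };

// Orientation of the triple (p, q, r):
//   +1 if r lies to the left of the directed line pq,
//   -1 if it lies to the right,
//    0 if the three points are collinear (the degenerate case).
// Exact as long as every coordinate has absolute value at most 2^30:
// differences are then at most 2^31 and products at most 2^62 in magnitude.
int orientation(point p, point q, point r) {
    std::int64_t a = (q.x - p.x) * (r.y - p.y);
    std::int64_t b = (q.y - p.y) * (r.x - p.x);
    // Compare instead of subtracting, so that no overflow can occur.
    return (a > b) - (a < b);
}
```

Because the result is computed without rounding, the collinear case is reported as exactly 0. An algorithm built on this predicate can then treat degenerate inputs explicitly instead of failing on them.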

4.4  LEDA  Extension  Packages 

LEDA  extension  packages  are  a  new  feature  of  the  LEDA  project  structure. 
Up to two years ago, most of LEDA was developed by a small group of
persons under the tight supervision of Stefan Näher; no code went into the system
that was not thoroughly understood by either Stefan Näher or Christian Uhrig.
The growing number of contributors and the fact that Stefan Näher has new
responsibilities as a professor have forced us to change the project structure.
We decided to split LEDA into a core system (the actual LEDA version) and to
shift enhancements into additional software packages.

LEDA extension packages (LEPs) extend LEDA into particular application
domains and areas of algorithmics not covered by the core system. LEDA
extension packages satisfy requirements that guarantee compatibility with the
LEDA philosophy. LEPs have LEDA-style documentation, they are implemented
in as platform-independent a way as possible, and the installation process allows
a close integration into the LEDA core library.

Currently, there are no released LEPs available, but there are several LEPs
under construction: PQ-trees (coordinated by Sebastian Leipert, Köln), dynamic
graph algorithms (coordinated by David Alberts, Halle), the homogeneous
planar CGAL geokernel (coordinated by Stefan Schirra, Saarbrücken), a
homogeneous d-dimensional geokernel (coordinated by Michael Seel, Saarbrücken), and
a library for graph drawing (DFG-project Automatisches Graphenzeichnen).
Abstract. We present a unified treatment of the hierarchy defined by
Klaus Wagner for ω-rational sets and also introduced in the more general
framework of descriptive set theory by William W. Wadge. We show that
this hierarchy can be defined by syntactic invariants, using the concept
of an ω-semigroup.


1  Introduction 

The idea of a Muller automaton was introduced by David Muller as a variant
of usual finite automata, well suited for the recognition of infinite sequences. It
was later proved by McNaughton that any recognizable set of ω-words can be
recognized by a deterministic Muller automaton.

Klaus Wagner introduced in 1979 [22] two concepts defined on Muller
automata: chains and superchains. Together with an operation on automata called
derivation, he proved that the maximal lengths of chains and superchains
(and the ones obtained on the derived automata) are enough to characterize the
classes of recognizable ω-sets up to the inverse image under a continuous
function. This classification has also been investigated independently by W. Wadge.
He studied the reduction by a continuous function in abstract topological
spaces, as a refinement of the classical Borel hierarchy. His results are based
on a particular class of games, now called Wadge games. His classification itself
is known as the Wadge hierarchy [10]. The connections between both theories
were first discovered by Pierre Simonnet [19]. The Wagner hierarchy has been
partially rediscovered several times [2, 9]. The interest in the classification of
ω-rational sets was revived by the studies concerning the logic of distributed
processing [15].

Since then Thomas Wilke [24] has shown how one could use, in the case of
infinite words, algebraic methods that allow finite automata to be replaced by finite
semigroups. This has led to the notion of an ω-semigroup introduced in [17].
This approach has the advantage of making it easier to define a variety along
the lines of Eilenberg’s theory.

Another direction was investigated by Jean-Eric Pin in [18]. He has shown
that the notion of ordered semigroup could be used to define families of
recognizable sets that are not closed under complementation. This is especially
interesting in the case of infinite words since very natural families like the open
sets are not closed under complementation.
We would like to show here how Klaus Wagner’s ideas fit into the present
framework using ω-semigroups. In particular, we shall see that the definition of
chains and superchains can be formulated in ω-semigroups, providing a clear
explanation of the fact that they do not depend on the particular automaton
used to recognize a given set but on the set itself. We shall show how the classes
of the Wagner hierarchy are defined in topological terms. We will also investigate
the link between Wagner’s notions and that of ordered semigroups.

The work presented here is based on results obtained, in great part, in the
first author’s doctoral thesis [4]. Part of it was presented at a conference held in
Porto [6]. The results concerning the equivalence of the various definitions of chains
and superchains will appear soon in [7]. The ones concerning the hierarchy itself
will be published in a second paper [5].

2  Preliminaries 

We assume a familiarity with the basic concepts of ω-rational sets and automata.
For an introduction, we refer the reader to [21] or [16]. A word about notation.
The alphabet is usually denoted by the symbol $A$. The set $A^*$ (resp. $A^+$) is
the set of finite words (resp. nonempty finite words) on the alphabet $A$. The
set of (one-sided) infinite words on $A$ is denoted by $A^\omega$. We consider $A^\omega$ as a
topological space with the usual Cantor topology.

We shall deal often with classes of sets. Since the sets considered are subsets
of the topological space $A^\omega$, a class of sets is really a mapping assigning to each
alphabet $A$ a set of subsets of $A^\omega$. The dual class of a class $\Gamma$ is formed of the
complements (within each $A^\omega$) of the sets in $\Gamma$. It is denoted by $\check{\Gamma}$. We say that
$\Gamma$ is ambiguous if $\Gamma = \check{\Gamma}$.

We shall use ordinals to index classes of sets. The symbol $\omega$ will thus be used
in two ways, either to denote an ordinal in expressions like $\omega + 1$ or to denote
an ω-rational set like $(a^*b)^\omega$. We hope that it will not bring confusion.

We now recall the definition of ω-semigroups and Wilke algebras. For a more
detailed presentation, we refer the reader to [17]. We assume some familiarity
with the basic notions of semigroup theory. We use the notation of [8] for all
undefined notions in semigroup theory. We use the traditional notation $S^1$ to
denote the semigroup obtained by adding a new neutral element 1 to $S$.

An ω-semigroup is a pair $S = (S_+, S_\omega)$ where $S_+$ is a semigroup and $S_\omega$ is
a set with two operations in addition to the semigroup operation of $S_+$: a left
action of $S_+$ on $S_\omega$

$$(s, u) \mapsto s.u$$

and an infinite product

$$\pi : S_+ \times S_+ \times S_+ \times \cdots \to S_\omega$$

These operations must satisfy the following axioms:

1. The action of $S_+$ on $S_\omega$ is associative: for $s, t \in S_+$ and $u \in S_\omega$,

$$s.(t.u) = (st).u$$
2. The infinite product is ω-associative, in the sense that for any sequence
$(s_n)_{n \ge 0}$ of elements of $S_+$ and any strictly increasing sequence $(n_i)_{i \ge 0}$ of
integers with $n_0 = 0$, one has

$$\pi(s_0, s_1, s_2, \ldots) = \pi(t_0, t_1, t_2, \ldots)$$

with $t_i = s_{n_i} s_{n_i + 1} \cdots s_{n_{i+1} - 1}$.

3. The left action is compatible with the infinite product: for elements $s$ and
$s_0, s_1, s_2, \ldots$ of $S_+$ one has

$$s.\pi(s_0, s_1, s_2, \ldots) = \pi(s, s_0, s_1, s_2, \ldots)$$

The associativity of the operations allows one to denote all operations by mere
concatenation, with $su$ instead of $s.u$ and $s_1 s_2 \cdots$ instead of $\pi(s_1, s_2, \ldots)$.

An ω-semigroup morphism from $S = (S_+, S_\omega)$ into $S' = (S'_+, S'_\omega)$ is a pair
$\varphi = (\varphi_+, \varphi_\omega)$ where $\varphi_+$ is a semigroup morphism from $S_+$ into $S'_+$ and $\varphi_\omega$ is a function
from $S_\omega$ into $S'_\omega$ which is compatible with the ω-semigroup structure, i.e., the
left action and the infinite product.

Thus an ω-semigroup is not an algebra in the usual sense since one of its
operations has infinitely many arguments.

The concepts of rational expression and of ω-rational expression extend to
ω-semigroups in the following way. Let $S$ be a semigroup and $X$ be a subset of $S$.
We denote by $X^+$ the subsemigroup generated by $X$ in $S$. We denote by $X^*$ the
subset of $S^1$ defined by $X^* = \{1\} + X^+$. In this way, for any $s \in S$ and $X \subseteq S$,
both subsets $sX^*$ and $X^*s$ are defined as subsets of $S$. Let now $S = (S_+, S_\omega)$ be
an ω-semigroup. For $X, Y \subseteq S_+$, we denote by $XY^\omega$ the set

$$XY^\omega = \{ x y_1 y_2 \cdots \mid x \in X,\ y_i \in Y \}$$

We further introduce a variant of ω-semigroups which is an algebra in the
usual sense since all its operations have finite arity and is well suited to describe
finite ω-semigroups. This concept is due to Wilke [23, 24].

A Wilke algebra is a pair $S = (S_+, S_\omega)$ where $S_+$ is a semigroup and $S_\omega$ is a
set with two operations: a left action of $S_+$ on $S_\omega$ and a unary operation from
$S_+$ into $S_\omega$, denoted

$$t \mapsto t^\omega$$

The operation $\omega$ must satisfy the following axioms:

$$(s^n)^\omega = s^\omega$$

$$s(ts)^\omega = (st)^\omega$$

for all $s, t \in S_+$ and $n \ge 1$.

A  Wilke  algebra  morphism  is  a  pair  of  functions  compatible  with  the  Wilke 
algebra  structure. 

A well-known version of Ramsey’s theorem says that if we define a coloring
$\varphi : A^+ \to S$ of all words using only a finite number of colors, then each ω-word $x$
has a factorization

$$x = v_0 v_1 v_2 \cdots$$

with all blocks except those involving the first one of the same color, i.e., such
that $\varphi(v_i v_{i+1} \cdots v_{i+k}) = \varphi(v_j v_{j+1} \cdots v_{j+l})$ for $i, j \ge 1$, $k, l \ge 0$. This
formulation holds even if the set of colors is a finite set without a multiplicative
structure. In the case where $S$ is a finite semigroup and $\varphi$ a semigroup morphism,
the result implies that for any ω-word $x$ there is a pair of an element $s \in S$ and
an idempotent $e = e^2 \in S$ such that $s = se$ and $x \in \varphi^{-1}(s)\,\varphi^{-1}(e)^\omega$.

The following result is essentially a consequence of Ramsey’s theorem. It shows
that a finite ω-semigroup and a finite Wilke algebra are essentially the same
thing.

Theorem 1. For any finite Wilke algebra $S = (S_+, S_\omega)$, there is a unique
infinite product from $S_+$ into $S_\omega$ making $S$ an ω-semigroup such that $s^\omega = sss\cdots$
for all $s$ in $S_+$.

For a proof, see [24] or [17]. In the sequel, we shall not distinguish between
finite Wilke algebras and finite ω-semigroups.

We say that a morphism $\varphi : A^\infty \to S$ from $A^\infty = (A^+, A^\omega)$ onto an ω-semigroup
$S = (S_+, S_\omega)$ recognizes an ω-set $X \subseteq A^\omega$ if $X = \varphi^{-1}(P)$ for some $P \subseteq S_\omega$.

The following result extends the classical concept of recognition by a finite
semigroup for a rational set to ω-rational sets. The theorem can really be credited
to Büchi since he had the original idea of introducing congruences of finite index
to define rational ω-sets. For a proof, see [17].

Theorem 2. A set $X \subseteq A^\omega$ is ω-rational iff there exists an ω-semigroup
morphism from $A^\infty = (A^+, A^\omega)$ onto a finite ω-semigroup $S = (S_+, S_\omega)$
recognizing $X$.

The notion of an ω-semigroup has been extended by Nicolas Bedon to
countable ordinals in the sense that ω-words are replaced by words indexed by a
countable ordinal [3]. This generalization has the advantage of giving a more uniform
structure: the operations are defined everywhere.

3  Chains  and  superchains 

In this section, we introduce the notions of chains and superchains in automata
and in ω-semigroups.


3.1  Chains  and  superchains  in  Muller  automata 

We recall that a Muller automaton is a deterministic finite automaton
$\mathcal{A} = (Q, E, i, \mathcal{T})$ where $Q$ is the state set, $E \subseteq Q \times A \times Q$ is the set of transitions and
$i \in Q$ is the initial state. The table $\mathcal{T} \subseteq 2^Q$ is the set of accepting subsets of $Q$.
We moreover suppose a Muller automaton to be complete: for each state $q \in Q$
and each symbol $a \in A$, there is a transition from $q$ labeled by $a$. A set $R \subseteq Q$
is called positive if $R \in \mathcal{T}$ and negative otherwise.

A subset $T$ of $Q$ is said to be admissible if there is a cycle $c$ in $\mathcal{A}$ accessible
from the initial state $i$, such that the set of states encountered on $c$ is exactly $T$.
We say that $T$ is the content of $c$.

Let $\mathcal{A} = (Q, E, i, \mathcal{T})$ be a complete Muller automaton. An $\mathcal{A}$-chain of length $m$
is an increasing sequence

$$R_0 \subset R_1 \subset \cdots \subset R_m$$

of $m + 1$ admissible subsets of $Q$ such that, for $0 \le i \le m$, the $R_i$ are alternately
in $\mathcal{T}$ and outside $\mathcal{T}$.

We say that the chain is positive if $R_0 \in \mathcal{T}$ and negative if $R_0 \notin \mathcal{T}$. We
denote by $m^+(\mathcal{A})$ (resp. $m^-(\mathcal{A})$) the maximal length of positive (resp. negative)
$\mathcal{A}$-chains and we let $m(\mathcal{A}) = \max(m^+(\mathcal{A}), m^-(\mathcal{A}))$. It is obvious by the
definition that $m(\mathcal{A})$ is finite for any finite Muller automaton $\mathcal{A}$. One indeed has the
inequality $m(\mathcal{A}) < \operatorname{card}(Q)$.

Wilke and Yoo have shown in [25] that $m(\mathcal{A})$ can be computed in polynomial
time. This contrasts with the fact that the computation of $m(X)$ for an ω-rational
set $X$ given by deterministic Rabin (or Streett) automata is NP-complete [11].

Example 1. Consider the set $X = (a^*b)^\omega$ of ω-words over $\{a, b\}$ which have an
infinite number of symbols $b$. This set $X$ is recognized by the automaton $\mathcal{A}_1$
represented in Figure 1 with $\mathcal{T} = \{\{2\}, \{1,2\}\}$. The sequence $(\{1\}, \{1,2\})$ is a
negative chain of length 1. There are no positive chains of length 1 and thus
$m = m^- = 1$.


Fig. 1. Automaton $\mathcal{A}_1$.
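
For small automata the quantities $m^+(\mathcal{A})$ and $m^-(\mathcal{A})$ can be found by brute force directly from the definitions. The following sketch is ours, not the polynomial-time algorithm of [25]. It enumerates all subsets of $Q$ as bitmasks, so it is exponential in the number of states; transition labels are dropped because only the underlying graph matters. States are renumbered from 0.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <utility>
#include <vector>

struct muller {
    int states;                              // states are numbered 0 .. states-1
    int initial;                             // initial state
    std::vector<std::pair<int, int>> edges;  // transitions, labels omitted
    std::set<unsigned> table;                // accepting subsets, as bitmasks
};

bool has(unsigned mask, int q) { return ((mask >> q) & 1u) != 0; }

// States reachable from src by at least one transition, staying inside `allowed`.
unsigned reach(const muller& a, int src, unsigned allowed) {
    unsigned seen = 0;
    for (bool changed = true; changed;) {
        changed = false;
        for (const auto& e : a.edges) {
            bool from_ok = e.first == src || has(seen, e.first);
            bool inside = has(allowed, e.first) && has(allowed, e.second);
            if (from_ok && inside && !has(seen, e.second)) {
                seen |= 1u << e.second;
                changed = true;
            }
        }
    }
    return seen;
}

// r is admissible iff it is the content of a cycle accessible from the
// initial state: every state of r must reach all of r without leaving r.
bool admissible(const muller& a, unsigned r) {
    unsigned full = (1u << a.states) - 1u;
    unsigned accessible = reach(a, a.initial, full) | (1u << a.initial);
    if (r == 0 || (r & accessible) != r) return false;
    for (int q = 0; q < a.states; ++q)
        if (has(r, q) && reach(a, q, r) != r) return false;
    return true;
}

struct chain_lengths { int pos; int neg; };  // -1 means "no such chain"

// Longest alternating chain starting at each admissible set, computed from
// the largest subsets downwards (a strict superset has a larger bitmask).
chain_lengths max_chains(const muller& a) {
    unsigned full = (1u << a.states) - 1u;
    std::vector<int> best(full + 1, -1);     // -1 marks "not admissible"
    chain_lengths m{-1, -1};
    for (unsigned r = full; r >= 1; --r) {
        if (!admissible(a, r)) continue;
        bool pos = a.table.count(r) > 0;
        best[r] = 0;
        for (unsigned s = r + 1; s <= full; ++s) {
            bool superset = (s & r) == r;
            if (superset && best[s] >= 0 && (a.table.count(s) > 0) != pos)
                best[r] = std::max(best[r], best[s] + 1);
        }
        int& slot = pos ? m.pos : m.neg;
        slot = std::max(slot, best[r]);
    }
    return m;
}
```

As a usage example, assume that $\mathcal{A}_1$ has the transitions $1 \to 1$ on $a$, $1 \to 2$ on $b$, $2 \to 2$ on $b$ and $2 \to 1$ on $a$. This is an assumption: the figure is not reproduced here, but these transitions are consistent with Example 1. With states 1 and 2 renumbered 0 and 1 and $\mathcal{T}$ encoded as the bitmasks 2 and 3, `max_chains` yields $m^+ = 0$ and $m^- = 1$, in agreement with the example.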


An $\mathcal{A}$-superchain of length $n$ is a sequence

$$C_0, C_1, \ldots, C_n$$

of $n + 1$ $\mathcal{A}$-chains of length $m(\mathcal{A})$ such that:

(i) Each $C_i$ is accessible from $C_{i-1}$ for $1 \le i \le n$, i.e., there exists a path from
some state in $C_{i-1}$ to some state in $C_i$,

(ii) The $\mathcal{A}$-chains $C_i$ are alternately positive and negative.
We say that the superchain is positive if $C_0$ is positive and negative
otherwise. We denote by $n^+(\mathcal{A})$ (resp. $n^-(\mathcal{A})$) the maximal length of positive (resp.
negative) superchains and $n(\mathcal{A}) = \max(n^+(\mathcal{A}), n^-(\mathcal{A}))$. We let $n^+(\mathcal{A}) = -1$
(resp. $n^-(\mathcal{A}) = -1$) if the set of positive (resp. negative) superchains is empty.
It is obvious by definition that $n(\mathcal{A})$ is finite for any finite Muller automaton $\mathcal{A}$.
One indeed has the inequality $n(\mathcal{A}) < \operatorname{card}(Q)$.


Fig. 2. Automaton $\mathcal{A}_2$.


Example 2. Consider the set $X = b^*ab^\omega$. It is recognized by the Muller
automaton $\mathcal{A}_2$ of Figure 2 with $\mathcal{T} = \{\{2\}\}$. All chains are of length 0 and
$m = m^+ = m^- = 0$. The sequence $(\{1\}, \{2\}, \{3\})$ is a negative superchain of length 2. One
has $n = n^- = 2$ and $n^+ = 1$.

3.2  Chains  and  superchains  in  ω-semigroups

Let $S = (S_+, S_\omega)$ be an ω-semigroup and let $X$ be a subset of $S_\omega$. Let $C = (Y, Z)$
be a pair where $Y$ is a nonempty subset of $S_+$ and $Z = z_0, z_1, \ldots, z_m$ is a
sequence of $m + 1$ elements of $S_+$. Let

$$Z_i = z_0 + z_1 + \cdots + z_i$$

$$W_i = Y Z_m^* (Z_i^* z_i)^\omega \qquad (1)$$

for $0 \le i \le m$.

We say that the pair $C$ is an $X$-chain iff the sets $W_i$ are alternately included
in $X$ and disjoint from $X$.

The number $m$ is called the length of the chain $C$. It is important to observe
that $m$ is the number of alternations in the sequence $W_0, \ldots, W_m$ rather than
the length of the sequence in the usual sense which would be $m + 1$.

We distinguish, among chains, positive and negative ones according to the
nature of the first element. A positive chain is one such that $W_0 \subseteq X$ and a
negative one such that $W_0 \cap X = \emptyset$. Two positive (resp. negative) chains are said
to be of the same sign.

We denote by $m^+(X)$ (resp. $m^-(X)$) the maximal length of the positive (resp.
negative) $X$-chains and $m(X) = \max(m^+(X), m^-(X))$. We set $m^+(X) = -1$
(resp. $m^-(X) = -1$) if the set of positive (resp. negative) chains is empty and
$m^+(X) = m^-(X) = \infty$ if the lengths of $X$-chains are unbounded. We shall see
that $m(X)$ is always finite for an ω-rational set $X$.

We now come to the definition of a superchain in an ω-semigroup.

Let $S = (S_+, S_\omega)$ be an ω-semigroup and let $X$ be a subset of $S_\omega$. An
$X$-superchain of length $n$ is a sequence

$$C_0, C_1, \ldots, C_n$$

of $n + 1$ $X$-chains $C_i = (Y_i, Z_i)$, all of maximal length $m = m(X)$ such that,
with $Z_i = z_{i0}, z_{i1}, \ldots, z_{im}$, we have:

(i) Each $C_i$ is accessible from $C_{i-1}$ for $1 \le i \le n$, i.e., there is an element
$u_i \in S_+$ such that $Y_{i-1} Z_{i-1}^* u_i \subseteq Y_i$.

(ii)  The  chains  Ci  are  alternately  positive  and  negative. 

We say that the superchain is positive if $C_0$ is positive and negative otherwise.
We denote by $n^+(X)$ (resp. $n^-(X)$) the maximal length of positive (resp.
negative) superchains and $n(X) = \max(n^+(X), n^-(X))$. We let $n^+(X) = -1$
(resp. $n^-(X) = -1$) if the set of positive (resp. negative) superchains is empty.
We shall see that $n(X)$ is also finite if $X$ is ω-rational.

3.3  Correspondence  between  the  definitions 

We now come to the fact that the definitions of a chain in automata and in
ω-semigroups are in correspondence. This has two main consequences: first, it
shows that the integers $m(X)$ are finite and computable for any ω-regular set $X$.
Second, it shows that the integers $m(\mathcal{A})$ do not depend on the automaton but
only on the set recognized. We have the following theorem.

Theorem 3. Let $X \subseteq A^\omega$ be an ω-rational set recognized by a complete Muller
automaton $\mathcal{A} = (Q, E, i, T)$. The following equalities hold:

$$m^+(X) = m^+(\mathcal{A}) \quad \text{and} \quad m^-(X) = m^-(\mathcal{A}).$$

Let $\varphi : S \to S'$ be a morphism from an ω-semigroup $S = (S_+, S_\omega)$ onto an
ω-semigroup $S' = (S'_+, S'_\omega)$. Let $X \subseteq S_\omega$ and $X' \subseteq S'_\omega$ be such that
$X = \varphi^{-1}(X')$.

The image $(Y', Z')$ of an $X$-chain $(Y, Z)$ is an $X'$-chain of the same length
and sign and each $X'$-chain is the image of an $X$-chain of the same length and
sign.

Thus chains can be computed in any ω-semigroup recognizing $X$, in particular
in a finite ω-semigroup when $X$ is ω-rational. We will see in Section 6 that
chains in finite ω-semigroups can be defined differently.

We now come to the fact that the definitions of a superchain in automata
and in ω-semigroups are also in correspondence. As in the case of chains, this
has two main consequences: first, it shows that the integers $n(X)$ are finite and
computable for any ω-regular set $X$. Second, it shows that the integers $n(\mathcal{A})$
do not depend on the automaton but only on the set recognized. We have the
following theorem.
Theorem 4. Let $X \subseteq A^\omega$ be an ω-rational set recognized by a complete Muller
automaton $\mathcal{A} = (Q, E, i, T)$. The following equalities hold:

$$n^+(X) = n^+(\mathcal{A}) \quad \text{and} \quad n^-(X) = n^-(\mathcal{A}).$$

4  Wagner’s  hierarchy 

To a Muller automaton $\mathcal{A}$, one associates another Muller automaton called the
derived automaton and denoted $\partial\mathcal{A}$. It is nonempty only when $n^+ = n^-$. It is
then obtained from $\mathcal{A}$ by the following transformation:

1.  All  states  that  belong  to  a  maximal  positive  superchain  are  collapsed  into  a 

single  state  and  the  set  is  positive. 

2.  All  states  that  belong  to  a  negative  superchain  are  collapsed  into  a  single 
state  a  and  the  set  {?-}  is  negative. 

It was shown by Klaus Wagner that the set recognized by $\partial\mathcal{A}$ only depends
on the set $X$ recognized by $\mathcal{A}$ and not on the particular Muller automaton used
to recognize $X$. It can therefore be denoted $\partial X$.


Fig. 3. Automaton $\mathcal{A}_3$.


Example 3. Consider the automaton $\mathcal{A}_3$ of Figure 3 with $T = \{\{1\}, \{1, 2\}, \{3\}\}$.
We have for $\mathcal{A}_3$ $m = m^+ = m^- = 1$ and $n = n^+ = n^- = 0$. The derived
automaton $\mathcal{A}_4 = \partial\mathcal{A}_3$ is represented in Figure 4. We then have for $\mathcal{A}_4$
$m = m^+ = m^- = 0$, $n^+ = 0$ and $n = n^- = 1$. Since $n^+ \ne n^-$, we have $\partial\mathcal{A}_4 = \emptyset$.

We associate to an ω-rational set $X$ two ordinals denoted $\gamma(X)$ and $\mu(X)$
which are defined as follows. The ordinal $\gamma(X)$ is

$$\gamma(X) = \begin{cases} n(X) & \text{if } m(X) = 0 \\ \omega^{m(X)} \cdot (n(X) + 1) & \text{otherwise} \end{cases}$$


Fig. 4. The derived automaton $\mathcal{A}_4 = \partial\mathcal{A}_3$.


For example, we have for the sets $X_1$ and $X_2$ recognized by the automata $\mathcal{A}_1$
and $\mathcal{A}_2$ of the previous examples,

$$\gamma(X_1) = \omega \quad \text{and} \quad \gamma(X_2) = 2.$$

The ordinal $\mu(X)$ is then defined by

$$\mu(X) = \gamma(X) + \mu(\partial X).$$

The ordinal $\mu(X)$ is an arbitrary ordinal $< \omega^\omega$ and moreover, since
$m(\partial X) < m(X)$ as soon as $m(X) \ge 1$, the decomposition given by the definition
of $\mu(X)$ produces the Cantor normal form of the ordinal $\mu(X)$.

Both ordinals $\gamma(X)$ and $\mu(X)$ can be computed from any Muller automaton
$\mathcal{A}$ recognizing the ω-set $X$ since the integers $m(\mathcal{A})$ and $n(\mathcal{A})$ only depend on the
set recognized by $\mathcal{A}$.

For example, we have for the ω-set $X_3$ recognized by the automaton $\mathcal{A}_3$ given
above

$$\mu(X_3) = \omega + 1$$

We finally associate to an ω-rational set $X$ a piece of information called its
sign and denoted sign($X$). It is an element of the three-element set
$\{\sigma, \delta, \pi\}$ defined as follows. We first have

$$\mathrm{sign}(X) = \begin{cases} \sigma & \text{if } n^- > n^+ \\ \pi & \text{if } n^- < n^+ \\ \delta & \text{if } n^- = n^+ \text{ and } m = 0 \\ \mathrm{sign}(\partial X) & \text{otherwise} \end{cases}$$

It is clear that sign($X$) = $\sigma$ iff sign($A^\omega - X$) = $\pi$ and that sign($X$) = $\delta$ iff
sign($A^\omega - X$) = $\delta$.

We introduce a preorder on the set $R(A)$ of ω-rational sets defined by
lexicographically ordering the pair $(\mu(X), \mathrm{sign}(X))$ with the convention that
$\delta > \sigma$ and $\delta > \pi$ ($\sigma$ and $\pi$ being incomparable). The equivalence classes
associated with the preorder are denoted

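$$\Sigma_\alpha = \{X \in R(A) \mid \mu(X) = \alpha, \ \mathrm{sign}(X) = \sigma\}$$

$$\Delta_\alpha = \{X \in R(A) \mid \mu(X) = \alpha, \ \mathrm{sign}(X) = \delta\}$$

$$\Pi_\alpha = \{X \in R(A) \mid \mu(X) = \alpha, \ \mathrm{sign}(X) = \pi\}$$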

For any ordinal $\alpha < \omega^\omega$, the classes $\Sigma_\alpha$ and $\Pi_\alpha$ are dual of one another and
the class $\Delta_\alpha$ is ambiguous. The order defined on ω-rational sets by Wagner’s
theorem has the familiar shape given by Figure 5.




Fig.  5.  The  Wagner  hierarchy 


It  may  be  useful  for  a  reader  used  to  Wagner’s  notation  to  realize  that  the 
correspondence  between  Wagner’s  notation  and  ours  is  the  following.  Our  class 
En  is  his  class  Cq,  our  An  his  Eq  and  our  77„  his  Dq.  For  m  >  1  our  class  E^^^r.  n 
is  Wagner’s  our  11  ^m  n  is  his  and  our  his  E^^ .  Moreover  if  the 

normal  form  of  the  ordinal  a  is 


a  —  .rzfc  -b  . . .  -f  .ni 
then  Ea  is  denoted  in  Wagner’s  notation 

pUk  pn2  /^ni  +  l 

The  idea  of  using  ordinals  instead  of  sequences  of  pairs  of  integers  was  suggested 
by  Jean-Pierre  Resseyre  (oral  communication). 

The order thus defined happens to completely characterize another order
called the Wadge order and defined in general as follows. Let $E$, $F$ be topological
spaces and let $X \subseteq E$, $Y \subseteq F$. We say that $X$ reduces to $Y$, written $X \le Y$, if
there exists a continuous function $f : E \to F$ such that $X = f^{-1}(Y)$.

We  can  now  state  Wagner’s  main  theorem. 


Theorem 5. (K. Wagner) Given ω-rational sets $X$, $Y$, one has the equivalence:
X  <Y  7(X)<7(y). 
The statement implies that for $X \in R(A)$, $Y \in R(B)$, one has $X \le Y$ iff
there exists a function $f : A^\omega \to B^\omega$ such that $X = f^{-1}(Y)$ and which is not
only continuous but also rational. This is actually the content of the theorem of
Büchi-Landweber (see [21]).

The main theorem due to Wadge is the following: in a topological space
like $A^\omega$, the order given by the reduction by a continuous function is a well
ordering [10]. Thus the classes of the associated equivalence can be indexed by
ordinals. When restricted to ω-rational classes, the order type of the hierarchy
is $\omega^\omega$.

5  Topological  classes 

We shall give here a description in topological terms of the classes of the
hierarchy. It allows one to prove Wagner’s theorem in one direction since the
topological characterization gives a definition of the classes invariant under the
inverse of a
continuous  function.  It  is  convenient  to  denote,  for  an  ordinal  a  < 

=  [J  Xp 

f3<a 


and  correspondingly  for  /!<«  and  A<a- 

We shall see that the classes of the Wagner hierarchy can be described using
differences, separated unions and biseparated unions, starting from simple
topological sets. We first describe the simple classes which happen to be classical
classes of the Borel hierarchy.


5.1  Simple  classes 

The first kind is the class of open sets. We shall denote here by $G$ the class of
open sets, rational or not (and not by $\Sigma_1$ as it is sometimes done in topology).
The following statement uses a special form of Büchi automata called weak: a
path is successful if it contains at least one terminal state.

Theorem 6. The following conditions are equivalent for an ω-rational set $X$.

w  A-  Gr<,. 

(ii) $X$ is open.

(iii) $X \le a^*b(a + b)^\omega$

(iv) $X$ is recognizable by a weak deterministic Büchi automaton.

Condition (i) can be formulated as follows: for all $x, y, z \in A^+$,

$$x y^\omega \in X \implies x y^* z^\omega \cap X \ne \emptyset$$

which precisely expresses that $m(X) = 0$ and $n^+(X) \le 0$. We shall see later that
this condition can be formulated using an inequality in ordered ω-semigroups.
The second class is the class of sets which are countable intersections of
open sets. We denote this class by $G_\delta$ (and not by $\Pi_2$ as it is done sometimes
in topology, since it would contradict our use of this notation). Similarly, we
denote by $F_\sigma$ the class of countable unions of closed sets. The following result is
originally due to H. Landweber [12].

Theorem 7. The following conditions are equivalent for an ω-rational set $X$.

(i)  X  e 

(ii) $X \in G_\delta$.

(iii) $X \le (a^*b)^\omega$

(iv) $X$ is recognizable by a deterministic Büchi automaton.

The equivalence between (ii) and (iii) is a general fact of descriptive set
theory, independent of the hypothesis that $X$ is ω-rational. A convenient way to
prove the implications is (i) $\Rightarrow$ (iv) $\Rightarrow$ (iii) $\Rightarrow$ (ii) $\Rightarrow$ (i). The first one is proved
using a well-known construction building a deterministic Büchi automaton from
a Muller automaton satisfying $m^+ \le 0$. The last one can be done by
reformulating condition (i) as follows: for all $x, y, z \in A^+$,

$$x(y + z)^* y^\omega \subseteq X \implies x(y^* z)^\omega \cap X \ne \emptyset$$

which expresses precisely that $m^+(X) \le 0$.


5.2  Boolean  combinations  of  open  sets 

In order to describe the boolean combinations of open sets, we introduce the
notion of a difference of sets. Let $\Gamma$ be a class of sets. We denote by $D_n(\Gamma)$ the
class of sets $X$ of the form

$$X = X_1 - X_2 + \cdots \pm X_n$$

where the sets $X_i$ satisfy $X_i \in \Gamma$ and $X_1 \supseteq X_2 \supseteq \cdots \supseteq X_n$. Such an expression
of $X$ is called a difference of length $n$. According to a theorem of Hausdorff, if $\Gamma$
is closed under finite unions and intersections and contains the empty set, the
union of all the classes $D_n(\Gamma)$ for $n \ge 1$ is the boolean closure of $\Gamma$. This means
that any set in the boolean closure of $\Gamma$ is equal to a difference of sets of $\Gamma$. The
classes $D_n(\Gamma)$ define a hierarchy within the boolean closure of $\Gamma$. As we shall
see,  it  turns  out  that,  when  F  is  the  class  17<i  of  w-rational  open  sets  or  when 
F  is  the  class  of  a;-rational  G§  sets,  the  classes  Dn(F)  coincide  with  classes 
of  the  Wagner  hierarchy. 
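
To make the notion of a difference of length n concrete, here is a minimal Python sketch that evaluates such an expression on finite sets. Finite sets are only a stand-in for ω-sets, and the function name `difference` is a hypothetical helper, not notation from the text.

```python
def difference(chain):
    """Evaluate X1 - X2 + X3 - ... for a decreasing chain of finite sets.

    For a decreasing chain, an element belongs to the result exactly
    when it lies in an odd number of the sets.
    """
    # The definition requires X1 ⊇ X2 ⊇ ... ⊇ Xn.
    for bigger, smaller in zip(chain, chain[1:]):
        if not smaller <= bigger:
            raise ValueError("the sets must form a decreasing chain")
    result = set()
    # Group the chain in pairs: (X1 - X2) ∪ (X3 - X4) ∪ ...
    for i in range(0, len(chain), 2):
        inner = chain[i + 1] if i + 1 < len(chain) else set()
        result |= chain[i] - inner
    return result


# A difference of length 3.
example = difference([{1, 2, 3, 4}, {2, 3, 4}, {3, 4}])
```

With this chain, the element 1 lies in one set, 2 in two sets, and 3 and 4 in three sets, so only 1, 3 and 4 survive.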

We  consider  here  the  classes  A„,  7e.,  the  classes  of  sets  A  such  that  7(A)  <  u 
or equivalently such that $m(X) = 0$. It is actually equivalent to assume on a
connected Muller automaton $\mathcal{A} = (Q, E, i, T)$ that $m(\mathcal{A}) = 0$ or that each strongly
connected component $P$ of $\mathcal{A}$ is saturated in the sense that $S \in T$ for all
admissible sets $S \subseteq P$ or for none of them. Such an automaton is clearly equivalent
to one of the following kind, that we propose to call a weak Muller automaton.
It is a finite automaton $\mathcal{A} = (Q, E, i, T)$ with a definition of a successful path
given by the following rule: a path $\gamma$ is successful if the set of states met along $\gamma$
is in $T$.
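
The acceptance rule of a weak Muller automaton can be sketched in Python for ultimately periodic words $uv^\omega$. This is only an illustration under stated assumptions: the automaton is taken to be deterministic and complete, it is encoded as a dictionary `delta`, and the three-state automaton below is a hypothetical example (not one of the figures), chosen so that it accepts $b^*ab^\omega$.

```python
def states_met(delta, start, u, v):
    """Set of states visited by the run on u v^omega (deterministic automaton)."""
    state = start
    seen = {state}
    for letter in u:
        state = delta[(state, letter)]
        seen.add(state)
    block_starts = set()
    # Once a block of v starts in a state where an earlier block already
    # started, the run repeats and no new state can appear.
    while state not in block_starts:
        block_starts.add(state)
        for letter in v:
            state = delta[(state, letter)]
            seen.add(state)
    return seen


def weak_muller_accepts(delta, start, table, u, v):
    """Weak Muller rule: the set of states met along the path must be in T."""
    return frozenset(states_met(delta, start, u, v)) in table


# Hypothetical automaton: state 1 loops on b, an a leads to state 2,
# state 2 loops on b, a second a leads to the sink state 3.
delta = {(1, "b"): 1, (1, "a"): 2, (2, "b"): 2, (2, "a"): 3,
         (3, "a"): 3, (3, "b"): 3}
table = {frozenset({1, 2})}
```

Note the contrast with the usual Muller condition, which looks at the states met infinitely often rather than at all states met.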

The  following  result  is  originally  due  to  Staiger  and  Wagner  [20].  It  means 
that  an  cj-ratioiial  set  X  belongs  to  the  class  !;<„  iff  it  is  equal  to  a  difference 
of  length  n  of  U/'-rational  open  sets. 

Theorem 8. One has for all $n < \omega$

E<n^Dn{S<l) 

Moreover, 

n<LO 

and  coincides  with  the  boolean  closure  of  the  family  of  rational  open  sets. 

In  the  second  equality,  the  inclusion  from  right  to  left  is  obvious  since  each 
Tn  is  contained  in  and  in  77<^.  The  converse  is  also  evident  since  a  set 

X  ^  X'<a>  n  /7<a'  satisfies  7n'^(A  )  <  0  and  m  (A)  <  0  and  therefore  m(A)  —  0. 

Theorem 8 is really a counterpart for rational sets of a theorem of Hausdorff
according to which one has, in a topological space such as $A^\omega$,

$$F_\sigma \cap G_\delta = \bigcup_{\alpha < \omega_1} D_\alpha(G)$$

where the union is on all countable ordinals (see [10] for example).

5.3 Separated classes and boolean combinations of $G_\delta$-sets

In  this  section,  we  describe  the  classes  for  o  =  .n.  We  first  consider  the 

case  of  tv  -  The  following  result  is  originally  due  to  K.  Wagner  [22]. 

Proposition 9. For all $m < \omega$, we have the equality

Dm{^<  U)  ) 

We now introduce the notion of a separated union. Let $X_1, X_2, Y \subseteq A^\omega$ be
three ω-sets. Suppose furthermore that the three sets satisfy $X_1 \cap Y = \emptyset$ and
$X_2 \subseteq Y$. Following a notation borrowed from Alain Louveau [14], let us denote by
Sep($Y, X_1, X_2$) the union

$$X = X_1 + X_2$$

The picture is shown in Figure 6.

We say that $X$ is the separated union of $X_1$ and $X_2$ or that $X$ is the union of
$X_1$ and $X_2$ separated by $Y$ (we actually exchange $X_1$ and $X_2$ in the notation of
[14]). We also define, for two classes $\Gamma$, $\Delta$ of ω-sets, a new class Sep($\Gamma, \Delta$) as the
class of all sets of the form $X$ = Sep($Y, X_1, X_2$) for $Y \in \Gamma$, $X_1 \in \Delta$ and $X_2 \in \Delta$.
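
The side conditions of a separated union are easy to overlook, so the following Python sketch checks them explicitly before forming the union. Finite sets are used purely as a stand-in for ω-sets, and `sep` is a hypothetical helper name.

```python
def sep(y, x1, x2):
    """Union of x1 and x2 separated by y, on finite sets."""
    if x1 & y:
        raise ValueError("X1 must be disjoint from Y")
    if not x2 <= y:
        raise ValueError("X2 must be contained in Y")
    # The two parts are disjoint, so the union is a disjoint sum.
    return x1 | x2


separated = sep({1, 2, 3}, {4, 5}, {1, 2})
```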

The  following  result  gives  a  topological  description  of  the  classes  If 

is  analogous  to  a  statement  given  in  [22]. 

Theorem  10.  For  each  m  >  1  and  n>2,  one  has 

=  Sep(Dn-i(G),X<u>^) 

and  dually 

n<uj-rn  =Sep{Dn-\{G),n<^n.). 


Fig. 6. Separated union of $X_1$ and $X_2$.


5.4  Biseparated  classes 

We now relate the definition of the set $\partial X$ with the topological structure of $X$.
We borrow again a notation from Alain Louveau [14] and introduce the notion
of biseparated union. Let $X_1, X_2, Y_1, Y_2$ and $Z$ be five ω-sets satisfying
$X_1 \subseteq Y_1$, $X_2 \subseteq Y_2$, $Y_1 \cap Y_2 = \emptyset$, $Z \cap Y_1 = \emptyset$ and $Z \cap Y_2 = \emptyset$. Let us denote
Bisep($Y_1, Y_2, X_1, X_2, Z$) the union

$$X = X_1 + X_2 + Z$$

The picture is shown in Figure 7. We say that $X$ is the biseparated union of $X_1$,
$X_2$ and $Z$.


Fig. 7. Biseparated union of $X_1$, $X_2$ and $Z$.


A  are  three  classes  of  u;-sets,  we  denote  by 
Bisep(^,  r,  A) 

the  class  of  sets  A  =  Bisep(yi ,  y2,  Ai ,  A2,  A)  with  Yi,Y2  G  Ai  G  T,  A2  G  T 
and  Z  ^  A. 
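
As with the separated union, the biseparated union is a plain union guarded by disjointness and inclusion conditions. The Python sketch below checks the five conditions stated for Bisep on finite sets; the finite sets and the helper name `bisep` are assumptions made for illustration only.

```python
def bisep(y1, y2, x1, x2, z):
    """Biseparated union of x1, x2 and z, on finite sets."""
    conditions = [
        (x1 <= y1, "X1 must be contained in Y1"),
        (x2 <= y2, "X2 must be contained in Y2"),
        (not (y1 & y2), "Y1 and Y2 must be disjoint"),
        (not (z & y1), "Z must be disjoint from Y1"),
        (not (z & y2), "Z must be disjoint from Y2"),
    ]
    for holds, message in conditions:
        if not holds:
            raise ValueError(message)
    return x1 | x2 | z


biseparated = bisep({1, 2}, {3, 4}, {1}, {4}, {5})
```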

The  following  result  expresses  that  the  elements  of  the  class  are  the 

unions  of  sets  of  the  same  kind  (but  with  opposite  signs)  separated  by  disjoint 
open  sets  plus  some  set  of  lower  class  of  the  same  class. 

Theorem  11.  For  all  m>  I  and  n>l  and  (d  <  ,  one  has 


Aa;’^.n+/?  —  Bisep(G,  A/j) 

A(jjfn  —  Bisep(G,  A^^m 

. n 4- /?  —  Bisep(G,  ITp'j, 
6 Finite ω-semigroups

The definition of chains and superchains in finite ω-semigroups uses the Green’s
relations $\mathcal{R}$ and $\mathcal{H}$ defined as follows. For elements $s$, $t$ of a semigroup $S$, one has
$s \ge_{\mathcal{R}} t$ if $s = t$ or $t \in sS$ and $s \ge_{\mathcal{H}} t$ if $s = t$ or $t \in sS$ and $t \in Ss$. The relation
$\ge_{\mathcal{R}}$ is a preorder and the restriction of $\ge_{\mathcal{H}}$ to the idempotents also.

In the case of a subset $X$ of a finite ω-semigroup $S = (S_+, S_\omega)$, the definition
of a chain relative to $X$ can be used in the following form. It is a sequence
$(s, e_0, e_1, \ldots, e_m)$ of elements of $S_+$ such that:

(i) For $0 \le i \le m$, the pair $(s, e_i)$ is linked, i.e., $se_i = s$ and $e_i^2 = e_i$.

(ii) The sequence of idempotents $e_0, e_1, \ldots, e_m$ is decreasing for the $\mathcal{H}$ order.

(iii) The elements $se_i^\omega$ are alternately in $X$ and outside of $X$.

We have again the notion of a positive or negative chain according to $se_0^\omega \in X$ or
not. The definition of a chain in a finite ω-semigroup coincides with the definition
of a chain we gave in a general ω-semigroup in the following sense. To any chain
for the former definition can be associated another chain for the latter one with
the same length and same sign, and vice versa. The integers $m^+(X)$ and $m^-(X)$
do not depend on the definition of a chain considered.

The notion of a superchain is also adapted to the case of a finite ω-semigroup
to be defined as a sequence $U_0, U_1, \ldots, U_n$ of chains $U_i = (s_i, e_{i0}, e_{i1}, \ldots, e_{im})$ of
length $m$ such that:

(i) The sequence $s_i$ is decreasing for the $\mathcal{R}$ order, i.e.

$$s_0 \ge_{\mathcal{R}} s_1 \ge_{\mathcal{R}} \cdots \ge_{\mathcal{R}} s_n.$$

(ii) The chains $U_i$ are alternately positive and negative.

As for chains, the definition of a superchain in a finite ω-semigroup is
equivalent to the definition of superchain we gave in a general ω-semigroup.

The definition of chains and superchains on finite ω-semigroups allows one
to give a characterization of the classes of Wagner’s hierarchy. It would be
interesting to extend these ideas to classes defined for finite words.


7 Ordered ω-semigroups

An ordered ω-semigroup is an ω-semigroup $S = (S_+, S_\omega)$ with a partial order
$\le$ on each of the sets $S_+$ and $S_\omega$ which are compatible with all operations: for all
$s, t, u, v \in S$

$$s \le t \implies usv \le utv,$$
$$s \le t, \ u \le v \implies su^\omega \le tv^\omega$$

A morphism $\varphi : S \to T$ of ordered ω-semigroups is a morphism of ω-semigroups
which is also compatible with the orders: for all $s, t \in S_+$ or $s, t \in S_\omega$, $s \le t$
implies $\varphi(s) \le \varphi(t)$.
It has been shown by Jean-Eric Pin [18] that any ω-rational set has a finite
syntactic ordered ω-semigroup. The context of a finite word $u$ with respect to an
ω-set $X \subseteq A^\omega$ is the pair of sets $C(u) = (C_1(u), C_2(u))$ where $C_1(u)$ and
$C_2(u)$ are respectively defined by

$$C_1(u) = \{(v, x) \in A^* \times A^\omega \mid vux \in X\}$$

$$C_2(u) = \{(v, w) \in A^* \times A^* \mid v(uw)^\omega \in X\}.$$

In the same way, the context of an ω-word $x$ with respect to the ω-set $X \subseteq A^\omega$
is the set

$$C(x) = \{u \in A^* \mid ux \in X\}.$$

It is well known that if $S = (S_+, S_\omega)$ is the syntactic ω-semigroup of $X$, the
elements of $S_+$ (resp. $S_\omega$) correspond to contexts of finite words (resp. ω-words).
More precisely, two finite words $u$ and $u'$ (resp. two ω-words $x$ and $x'$) have the
same image in the syntactic ω-semigroup iff they have the same context. This
allows one to define the context of an element of $S$. Contexts could also have been
directly defined in $S$ with respect to the image $P$ of $X$ in $S_\omega$. An order can be
defined in $S$ by

$$s \le t \iff C(s) \subseteq C(t)$$

This order is compatible with the operations of $S$. The ω-semigroup $S$ equipped
with this order is then an ordered ω-semigroup. It is in fact the syntactic ordered
ω-semigroup of $X$.

In a finite semigroup, we denote the unique idempotent which is a power of $s$
by $s^\pi$ instead of the usual notation $s^\omega$ since the symbol $\omega$ has another meaning
here.
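
The unique idempotent power of an element can be found by walking through its successive powers. The Python sketch below does this for a finite semigroup given by a multiplication function; the function name `idempotent_power` and the example semigroup (integers modulo 6 under multiplication) are assumptions chosen for illustration.

```python
def idempotent_power(s, mul):
    """Return the idempotent among the powers s, s^2, s^3, ... of s.

    In a finite semigroup the powers of s eventually cycle, and the
    cycle contains exactly one idempotent.
    """
    power = s
    seen = []
    while power not in seen:
        if mul(power, power) == power:
            return power
        seen.append(power)
        power = mul(power, s)
    raise ValueError("no idempotent power found; is the product associative?")


def mul_mod6(a, b):
    """Multiplication modulo 6, a small finite semigroup."""
    return (a * b) % 6
```

For instance, the powers of 2 modulo 6 are 2, 4, 2, 4, ..., and 4 is idempotent since 4 · 4 = 16 = 4 modulo 6.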

The  following  statement  gives  a  characterization  of  open  sets  alternative  to 
Theorem  6. 

Theorem 12. An ω-rational set $X$ is open iff its syntactic ordered ω-semigroup
satisfies the following identity

x^  <  x^yz^ 

The  following  result  gives  a  syntactic  characterization  of  the  class 

Theorem  13.  An  u-rational  set  X  is  in  iff  its  syntactic  ordered  uj- semigroup 
satisfies  the  following  identity 

{x^yYx^  <  {x^yff 

As  a  consequence,  we  obtain  the  following  syntactic  characterization,  due  to 
Thomas  Wilke  [24],  of  the  sets  in  fl  which  are  also  the  boolean  com¬ 
binations  of  open  sets  by  Theorem  7. 

Theorem 14. An ω-rational set is a boolean combination of open sets iff its
syntactic ω-semigroup satisfies the identity


(x^yYx'^  =  (x^vY 
Actually, the identity given in [24] is the identity

which  can  be  shown  to  be  equivalent  to  the  previous  one. 

Conclusion 

It would be interesting to investigate further in several directions, including
the following ones.

7.1  A  syntactic  definition  of  the  derivative 

Klaus Wagner has introduced the notion of the derivative $\partial X$ of an ω-rational
set $X$. It is defined using a Muller automaton recognizing $X$. We do not know
how to define the derivative in a finite ω-semigroup in such a way that $\partial X$ can
be computed in the syntactic ω-semigroup of $X$.

7.2 Biinfinite words

The theory of ω-rational sets can be developed for sets of two-sided infinite
words [16]. Such sets have also been considered in symbolic dynamics [13]. A
symbolic dynamical system is by definition a set of biinfinite words which is
topologically closed and invariant under the shift. Let $S$ and $T$ be two symbolic
dynamical systems. A morphism from $S$ into $T$ is a function $f : S \to T$ which
is continuous and commutes with the shifts of $S$ and $T$. As a particular case of
symbolic dynamical systems, a sofic system is defined by a set of forbidden blocks
recognized by a finite automaton. As a still more restricted class, a system of
finite type is a set of biinfinite words defined by a finite set of forbidden blocks.
If $X$, $Y$ are symbolic dynamical systems, it is natural to say that $X \subseteq A^{\mathbb{Z}}$
reduces to $Y \subseteq B^{\mathbb{Z}}$, denoted $X \le Y$, if there exists a morphism $f$ from $A^{\mathbb{Z}}$
to $B^{\mathbb{Z}}$ such that $X = f^{-1}(Y)$. One thus obtains a hierarchy of subsets of $A^{\mathbb{Z}}$
analogous to the Wadge-Wagner hierarchy. The three classes defined previously
are precisely preserved by inverse morphisms. It would be interesting to know
the Wadge-Wagner classes of symbolic dynamical systems.

7.3  Finite  words 

It is an open problem to define a hierarchy for finite words analogous to Wagner’s
one. An objective for such a classification could be to obtain a refinement of the
characterization of some well known classes. For instance, the class of locally
testable sets is the boolean closure of the class of strictly locally testable ones.
The latter are finite unions of sets of the form $UA^* \cap A^*V \setminus A^*WA^*$ where
$U$, $V$ and $W$ are finite sets of words. If $S$ denotes the family of strictly locally
testable sets, the family $D_n(S)$ of differences of length $n$ of elements of $S$ defines
a hierarchy within locally testable sets.
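
Membership in a basic set of the form $UA^* \cap A^*V \setminus A^*WA^*$ only involves prefixes, suffixes and factors, as the following Python sketch shows. The function name `in_basic_set` and its parameter names are hypothetical; the expression is read as $(UA^* \cap A^*V) \setminus A^*WA^*$.

```python
def in_basic_set(word, prefixes, suffixes, forbidden):
    """Test whether word has a prefix in U, a suffix in V and no factor in W."""
    has_prefix = any(word.startswith(u) for u in prefixes)   # word in U A*
    has_suffix = any(word.endswith(v) for v in suffixes)     # word in A* V
    has_forbidden = any(w in word for w in forbidden)        # word in A* W A*
    return has_prefix and has_suffix and not has_forbidden
```

A strictly locally testable set is then a finite union of such basic sets, so membership reduces to a disjunction of calls to this test.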
It is possible to define Muller automata on finite words. Let indeed
$\mathcal{A} = (Q, E, T)$ be a finite automaton where $T$ is a subset of $Q \times 2^Q \times Q$. A finite
path $\gamma : i \to t$ in this automaton is successful if the set $R$ of states met along
the path is such that $(i, R, t) \in T$. The usual definition of locally testable sets
actually uses such automata: they are the sets recognized when the underlying
automaton is the standard local automaton.

A full parallel with the Wagner hierarchy requires a choice of a topology on finite
words. A possibility would be to consider the profinite topology associated to a
pseudo-variety of semigroups [1].
“Don’t express your ideas too clearly. Most people
think little of what they understand, and venerate
what they do not.”

(The  Art  of  Worldly  Wisdom, 
Baltasar  Gracian,  1647.) 


Abstract. We show how the constraint propagation process can be
naturally explained by means of chaotic iteration.


1  Introduction 

1.1  Motivation 

Over the last ten years constraint programming emerged as an interesting and
viable approach to programming. In this approach the programming process is
limited to a generation of requirements (“constraints”) and a solution of these
requirements by means of general and domain specific methods. The techniques
useful for finding solutions to sets of constraints were studied for some twenty
years in the field of Constraint Satisfaction. One of the most important of them
is constraint propagation, the elusive process of reducing a constraint satisfaction
problem to another one that is equivalent but “simpler”.

The algorithms that achieve such a reduction usually aim at reaching some
“local consistency”, which denotes some property approximating in some loose
sense “global consistency”, that is, the consistency of the whole constraint
satisfaction problem. (In fact, most of the notions of local consistency are neither
implied by nor imply global consistency.)

For some constraint satisfaction problems such an enforcement of local
consistency is already sufficient for finding a solution or for determining that none
exists. In some other cases this process substantially reduces the size of the search
space, which makes it possible to solve the original problem more efficiently by
means of some search algorithm.

The aim of this paper is to show that the constraint propagation algorithms
can be naturally explained by means of chaotic iteration, a basic technique used
for computing limits of iterations of finite sets of functions that originated from
numerical analysis (see Chazan and Miranker (1969)) and was adapted for
computer science needs by Cousot and Cousot (1977). In fact, several constraint
propagation algorithms proposed in the literature turn out to be instances of
generic chaotic iteration algorithms studied here.

Moreover, by characterizing a given notion of a local consistency as a common
fixed point of a finite set of monotonic and inflationary functions we can
automatically generate an algorithm achieving this notion of consistency by “feeding”
these functions into a generic chaotic iteration algorithm.


1.2  Preliminaries 

Definition 1. Consider a sequence of domains $\mathcal{D} := D_1, \ldots, D_n$.

- By a scheme (on $n$) we mean a sequence of different elements from $[1..n]$.

- We say that $C$ is a constraint (on $\mathcal{D}$) with scheme $i_1, \ldots, i_l$ if
$C \subseteq D_{i_1} \times \cdots \times D_{i_l}$.

- Let $\mathbf{s} := s_1, \ldots, s_k$ be a sequence of schemes. We say that a sequence of
constraints $C_1, \ldots, C_k$ on $\mathcal{D}$ is an $\mathbf{s}$-sequence if each $C_i$ is with scheme $s_i$.

- By a Constraint Satisfaction Problem $\langle \mathcal{D}; \mathcal{C} \rangle$, in short CSP, we mean a
sequence of domains $\mathcal{D}$ together with an $\mathbf{s}$-sequence of constraints $\mathcal{C}$ on $\mathcal{D}$. We
call then $\mathbf{s}$ the scheme of $\langle \mathcal{D}; \mathcal{C} \rangle$. □

Given an $n$-tuple $d := d_1, \ldots, d_n$ in $D_1 \times \cdots \times D_n$ and a scheme $s := i_1, \ldots, i_l$
on $n$ we denote by $d[s]$ the tuple $d_{i_1}, \ldots, d_{i_l}$. In particular, for $j \in [1..n]$ $d[j]$ is
the $j$-th element of $d$. By a solution to a CSP $\langle \mathcal{D}; \mathcal{C} \rangle$, where $\mathcal{D} := D_1, \ldots, D_n$,
we mean an $n$-tuple $d \in D_1 \times \cdots \times D_n$ such that for each constraint $C$ in $\mathcal{C}$ with
scheme $s$ we have $d[s] \in C$.
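
The definitions of $d[s]$ and of a solution translate directly into a brute-force enumeration. The Python sketch below is only meant to illustrate the definitions, not to be an efficient solver; the function names and the representation of a constraint as a pair (scheme, set of tuples) are assumptions.

```python
from itertools import product


def restrict(d, scheme):
    """d[s]: the components of the tuple d selected by the scheme (1-based)."""
    return tuple(d[i - 1] for i in scheme)


def solutions(domains, constraints):
    """All solutions of a CSP given as domains and (scheme, tuples) pairs."""
    return [d for d in product(*domains)
            if all(restrict(d, s) in c for s, c in constraints)]


# Three variables; x1 < x2 on scheme (1, 2) and x2 = x3 on scheme (2, 3).
domains = [[1, 2, 3], [1, 2, 3], [1, 2]]
constraints = [
    ((1, 2), {(1, 2), (1, 3), (2, 3)}),
    ((2, 3), {(1, 1), (2, 2)}),
]
found = solutions(domains, constraints)
```

Here the only solution is the tuple (1, 2, 2).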

Consider now a sequence of schemes $s_1, \ldots, s_k$. By its union, written as
$\langle s_1, \ldots, s_k \rangle$ we mean the scheme obtained from the sequences $s_1, \ldots, s_k$ by
removing from each $s_i$ the elements present in some $s_j$, where $j < i$, and by
concatenating the resulting sequences. For example, $\langle (3, 7, 2), (4, 3, 7, 5), (3, 5, 8) \rangle =
(3, 7, 2, 4, 5, 8)$. Recall that for an $s_1, \ldots, s_k$-sequence of constraints $C_1, \ldots, C_k$
their join, written as $C_1 \bowtie \cdots \bowtie C_k$, is defined as the constraint with scheme
$\langle s_1, \ldots, s_k \rangle$ and such that

$$d \in C_1 \bowtie \cdots \bowtie C_k \text{ iff } d[s_i] \in C_i \text{ for } i \in [1..k].$$
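
The union of schemes and the join can be sketched in Python as follows. The first function reproduces the example from the text; the second computes the join incrementally by merging compatible tuples. Both function names and the (scheme, set of tuples) representation are assumptions made for this illustration.

```python
def scheme_union(*schemes):
    """Concatenate the schemes, keeping only the first occurrence of each element."""
    union = []
    for scheme in schemes:
        for i in scheme:
            if i not in union:
                union.append(i)
    return tuple(union)


def join(constraints):
    """Join of a list of (scheme, tuples) pairs, returned as (scheme, tuples)."""
    scheme, tuples = (), {()}
    for s, c in constraints:
        new_scheme = scheme_union(scheme, s)
        joined = set()
        for d in tuples:
            known = dict(zip(scheme, d))
            for e in c:
                candidate = dict(zip(s, e))
                # Keep the pair only if both tuples agree on shared positions.
                if all(known.get(i, v) == v for i, v in candidate.items()):
                    merged = {**known, **candidate}
                    joined.add(tuple(merged[i] for i in new_scheme))
        scheme, tuples = new_scheme, joined
    return scheme, tuples


union_example = scheme_union((3, 7, 2), (4, 3, 7, 5), (3, 5, 8))
join_example = join([((1, 2), {(1, 2), (2, 3)}),
                     ((2, 3), {(2, 5), (3, 6), (4, 7)})])
```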

Further, given a constraint $C$ and a subsequence $s$ of its scheme, we denote
by $\Pi_s(C)$ the constraint with scheme $s$ defined by

$$\Pi_s(C) := \{d[s] \mid d \in C\},$$

and call it the projection of $C$ on $s$. In particular, for a constraint $C$ with scheme
$s$ and an element $j$ of $s$, $\Pi_j(C) = \{a \mid \exists d \in C \ a = d[j]\}$.
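
Projection can be illustrated in the same style. In the Python sketch below, the positions of a tuple are looked up through the scheme of the constraint, so that elements of the scheme (not raw tuple indices) select the components; the function name `project` and its argument order are hypothetical.

```python
def project(scheme, constraint, sub):
    """Projection of a constraint with the given scheme on the subsequence sub."""
    positions = [scheme.index(i) for i in sub]
    return {tuple(d[p] for p in positions) for d in constraint}


# A constraint with scheme (3, 7, 2).
c = {(1, 2, 3), (1, 5, 3)}
on_3_2 = project((3, 7, 2), c, (3, 2))
on_7 = project((3, 7, 2), c, (7,))
```

Projecting on a single element j yields one-element tuples here, which correspond to the values a = d[j] of the text.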

Given a CSP $\langle \mathcal{D}; \mathcal{C} \rangle$ we denote by $Sol(\langle \mathcal{D}; \mathcal{C} \rangle)$ the set of all solutions to it.
If the domains are clear from the context we drop the reference to $\mathcal{D}$ and just
write $Sol(\mathcal{C})$. The following observation is useful.




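Note 2. Consider a CSP $\langle \mathcal{D}; \mathcal{C} \rangle$ with $\mathcal{D} := D_1, \ldots, D_n$ and $\mathcal{C} := C_1, \ldots, C_k$ and
with scheme $s$.

(i) $Sol(\langle \mathcal{D}; \mathcal{C} \rangle) = C_1 \bowtie \cdots \bowtie C_k \bowtie_{i \in I} D_i$,

where $I := \{i \in [1..n] \mid i \text{ does not appear in } s\}$.

(ii) For every $\mathbf{s}$-subsequence $\mathbf{C}$ of $\mathcal{C}$ and $d \in Sol(\langle \mathcal{D}; \mathcal{C} \rangle)$ we have $d[\langle \mathbf{s} \rangle] \in Sol(\mathbf{C})$.

□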

Finally,  we  call  two  CSP’s  equivalent  if  they  have  the  same  set  of  solutions. 
Note  that  we  do  not  insist  that  these  CSP’s  have  the  same  sequence  of  domains 
or  the  same  scheme. 


2  Chaotic  Iterations 

As already mentioned in the introduction, one of the cornerstones of constraint
programming is constraint propagation. In general, two basic approaches fall
under this name:

-  reduce  the  domains  while  maintaining  equivalence; 

-  reduce  the  constraints  while  maintaining  equivalence. 

In  what  follows  we  study  these  two  processes  in  full  generality. 


2.1  Chaotic  Iterations  on  Simple  Domains 

In general, chaotic iterations are defined for functions that are projections on
individual components of a specific function with several arguments. In our
approach we study a more elementary situation in which the functions are unrelated
but satisfy certain properties. These functions are defined on specific partial
orders. We need the following concepts.

Definition 3. We call a partial order $(D, \sqsubseteq)$ an $\sqcup$-po if

- $D$ contains the least element, denoted by $\bot$,

- for every increasing sequence

$$d_0 \sqsubseteq d_1 \sqsubseteq d_2 \sqsubseteq \cdots$$

of elements from $D$, the least upper bound of the set

$$\{d_0, d_1, d_2, \ldots\},$$

denoted by $\bigsqcup_{j=0}^{\infty} d_j$ and called the limit of $d_0, d_1, \ldots$, exists,

- for all $a, b \in D$ the least upper bound of the set $\{a, b\}$, denoted by $a \sqcup b$,
exists.


Further, we say that

- an increasing sequence $d_0 \sqsubseteq d_1 \sqsubseteq d_2 \sqsubseteq \cdots$ eventually stabilizes at $d$ if for
some $j \geq 0$ we have $d_i = d$ for $i \geq j$,

- a partial order satisfies the finite chain property if every increasing sequence
of its elements eventually stabilizes. □

Definition 4. Consider a set $D$, an element $d \in D$ and a set of functions
$F := \{f_1, \ldots, f_k\}$ on $D$.

- By a run (of the functions $f_1, \ldots, f_k$) we mean an infinite sequence of
numbers from $[1..k]$.

- A run $i_1, i_2, \ldots$ is called fair if every $i \in [1..k]$ appears in it infinitely often.

- By an iteration of $F$ associated with a run $i_1, i_2, \ldots$ and starting with $d$ we
mean an infinite sequence of values $d_0, d_1, \ldots$ defined inductively by

$$d_0 := d,$$

$$d_j := f_{i_j}(d_{j-1}).$$

When $d$ is the least element of $D$ in some partial order clear from the context,
we drop the reference to $d$ and talk about an iteration of $F$.

- An iteration of $F$ is called chaotic if it is associated with a fair run. □
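
The notion of an iteration associated with a run can be sketched in Python as a generator. The two functions below and the choice of finite sets ordered by inclusion are illustrative assumptions; the periodic run 1, 2, 1, 2, ... is one example of a fair run.

```python
from itertools import cycle, islice

def iteration(functions, run, start):
    """Yield d_0, d_1, ... where d_0 = start and d_j = f_{i_j}(d_{j-1})."""
    d = start
    yield d
    for i in run:                 # run entries are 1-based, as in [1..k]
        d = functions[i - 1](d)
        yield d

def f1(x):
    return x | {1}

def f2(x):
    return x | {2} if 1 in x else x

fair_run = cycle([1, 2])          # every index appears infinitely often
prefix = list(islice(iteration([f1, f2], fair_run, frozenset()), 4))
```

Here `prefix` holds the first four values of the chaotic iteration that starts with the empty set.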

Definition 5. Consider a partial order $(D, \sqsubseteq)$. A function $f$ on $D$ is called

- inflationary if $x \sqsubseteq f(x)$ for all $x$,

- monotonic if $x \sqsubseteq y$ implies $f(x) \sqsubseteq f(y)$ for all $x, y$,

- idempotent if $f(f(x)) = f(x)$ for all $x$.

□
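
On a finite partial order the three properties can be checked exhaustively. The following sketch does so on the subsets of {1, 2, 3} ordered by inclusion; the functions `add_one`, `grow` and `drop_one` are hypothetical examples chosen so that each property fails for at least one of them.

```python
from itertools import combinations

def powerset(universe):
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_inflationary(f, elements, leq):
    return all(leq(x, f(x)) for x in elements)

def is_monotonic(f, elements, leq):
    return all(leq(f(x), f(y)) for x in elements for y in elements if leq(x, y))

def is_idempotent(f, elements):
    return all(f(f(x)) == f(x) for x in elements)

UNIVERSE = frozenset({1, 2, 3})
ELEMENTS = powerset(UNIVERSE)
subset = frozenset.issubset          # the order is set inclusion here

def add_one(x):                      # inflationary, monotonic, idempotent
    return x | {1}

def grow(x):                         # inflationary and monotonic, not idempotent
    missing = UNIVERSE - x
    return x | {min(missing)} if missing else x

def drop_one(x):                     # monotonic and idempotent, not inflationary
    return x - {1}
```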

The  following  observation  can  be  easily  distilled  from  a  more  general  result 
due  to  Cousot  and  Cousot  (1977).  To  keep  the  paper  self-contained  we  provide 
a  direct  proof. 

Theorem 6 (Chaotic Iteration). Consider an $\sqcup$-po $(D, \sqsubseteq)$ and a set of
functions $F := \{f_1, \ldots, f_k\}$ on $D$. Suppose that all functions in $F$ are inflationary
and monotonic. Then the limit of every chaotic iteration of $F$ exists and
coincides with

$$\bigsqcup_{j=0}^{\infty} f \uparrow j,$$

where the function $f$ on $D$ is defined by:

$$f(x) := \bigsqcup_{i=1}^{k} f_i(x)$$

and $f \uparrow j$ is an abbreviation for $f^j(\bot)$, the $j$-th fold iteration of $f$ started at $\bot$.

Proof. First notice that $f$ is inflationary, so $\bigsqcup_{j=0}^{\infty} f \uparrow j$ exists. Fix a chaotic
iteration $d_0, d_1, \ldots$ of $F$ associated with a fair run $i_1, i_2, \ldots$. Since all functions
$f_i$ are inflationary, $\bigsqcup_{j=0}^{\infty} d_j$ exists. The result follows directly from the following
two claims.

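Claim 1 $\forall j \; \exists m \; f \uparrow j \sqsubseteq d_m$.

Proof. We proceed by induction on $j$.

Base. $j = 0$. As $f \uparrow 0 = \bot = d_0$, the claim is obvious.

Induction step. Assume that for some $j \geq 0$ we have $f \uparrow j \sqsubseteq d_m$ for some
$m \geq 0$. Since

$$f \uparrow (j+1) = f(f \uparrow j) = \bigsqcup_{i=1}^{k} f_i(f \uparrow j),$$

it suffices to prove

$$\forall i \in [1..k] \; \exists m_i \; f_i(f \uparrow j) \sqsubseteq d_{m_i}. \quad (1)$$

Indeed, we have then by the fact that $d_l \sqsubseteq d_{l+1}$ for $l \geq 0$

$$\bigsqcup_{i=1}^{k} f_i(f \uparrow j) \sqsubseteq \bigsqcup_{i=1}^{k} d_{m_i} \sqsubseteq d_{m'},$$

where $m' := \max\{m_i \mid i \in [1..k]\}$.

So fix $i \in [1..k]$. By fairness of the considered run $i_1, i_2, \ldots$, for some $m_i > m$
we have $i_{m_i} = i$. Then $d_{m_i} = f_i(d_{m_i - 1})$. Now $d_m \sqsubseteq d_{m_i - 1}$, so by the
monotonicity of $f_i$ we have

$$f_i(f \uparrow j) \sqsubseteq f_i(d_m) \sqsubseteq f_i(d_{m_i - 1}) = d_{m_i}.$$

This proves (1). □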

Claim 2 $\forall m \; d_m \sqsubseteq f \uparrow m$.

Proof. The proof is by a straightforward induction on $m$. Indeed, for $m = 0$ we
have $d_0 = \bot = f \uparrow 0$, so the induction base holds.

To prove the induction step suppose that for some $m \geq 0$ we have $d_m \sqsubseteq f \uparrow m$.
For some $i \in [1..k]$ we have $d_{m+1} = f_i(d_m)$, so by the monotonicity of $f$ we
get

$$d_{m+1} = f_i(d_m) \sqsubseteq f(d_m) \sqsubseteq f(f \uparrow m) = f \uparrow (m+1).$$

□

□


In many situations some chaotic iteration studied in the Chaotic Iteration
Theorem 6 eventually stabilizes. This is for example the case when $(D, \sqsubseteq)$
satisfies the finite chain property. In such cases the limit of every chaotic iteration
can be characterized in an alternative way.

Corollary 7 (Chaotic Iteration). Suppose that under the assumptions of the
Chaotic Iteration Theorem 6 some chaotic iteration of $F$ eventually stabilizes.
Then every chaotic iteration of $F$ eventually stabilizes at the least fixed point of
$f$.

Proof. It suffices to note that if some chaotic iteration $d_0, d_1, \ldots$ of $F$ eventually
stabilizes at some $d_m$ then by Claims 1 and 2 $f \uparrow m = d_m$, so

$$\bigsqcup_{j=0}^{\infty} f \uparrow j = f \uparrow m. \quad (2)$$

Then, again by Claims 1 and 2, every chaotic iteration of $F$ stabilizes at $f \uparrow m$
and it is easy to see that by virtue of (2) $f \uparrow m$ is the least fixed point of $f$. □
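
The statement of Corollary 7 can be observed on a small finite example. The sketch below assumes sets of graph nodes ordered by inclusion and three inflationary, monotonic functions built from two hypothetical edge relations; it follows two different periodic (hence fair) runs and compares the results with the least fixed point of $f$ computed by plain iteration from the least element.

```python
def least_fixed_point(f, bottom):
    """Iterate f from bottom until it stabilizes (assumes a finite chain)."""
    d = bottom
    while f(d) != d:
        d = f(d)
    return d

def stabilize(functions, period, bottom):
    """Follow the fair run period, period, ... until a whole period changes nothing."""
    d = bottom
    while True:
        before = d
        for i in period:              # 1-based indices into `functions`
            d = functions[i - 1](d)
        if d == before:
            return d

EDGES_A = {0: {1}, 1: {2}}
EDGES_B = {2: {3}, 5: {4}}

def successors(edges, x):
    return frozenset(t for s in x for t in edges.get(s, ()))

def f1(x):
    return x | {0}

def f2(x):
    return x | successors(EDGES_A, x)

def f3(x):
    return x | successors(EDGES_B, x)

F = [f1, f2, f3]

def f(x):                             # the join (here: union) of all f_i(x)
    return f1(x) | f2(x) | f3(x)

bottom = frozenset()
limit_a = stabilize(F, [1, 2, 3], bottom)
limit_b = stabilize(F, [3, 3, 2, 1], bottom)
lfp = least_fixed_point(f, bottom)
```

Because all three functions are inflationary, a full period without change means that every function leaves the current value fixed, which is why `stabilize` may stop there.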


2.2  Chaotic  Iterations  on  Compound  Domains 


Not much more can be deduced about the process of the chaotic iteration unless
the structure of the domain $D$ is further known. So assume now that $(D, \sqsubseteq)$
is the Cartesian product of the $\sqcup$-po's $(D_i, \sqsubseteq_i)$, for $i \in [1..n]$, defined in the
expected way. It is straightforward to check that $(D, \sqsubseteq)$ is then an $\sqcup$-po, as well.
In what follows we consider a modification of the situation studied in the Chaotic
Iteration Theorem 6 in which each function $f_i$ affects only certain components
of $D$.

Consider the partial orders $(D_i, \sqsubseteq_i)$, for $i \in [1..n]$ and a scheme
$s := i_1, \ldots, i_l$ on $n$. Then by $(D_s, \sqsubseteq_s)$ we mean the Cartesian product of the partial
orders $(D_{i_j}, \sqsubseteq_{i_j})$, for $j \in [1..l]$.

Given a function $f$ on $D_s$ we say that $f$ is with scheme $s$. Instead of defining
iterations for the case of the functions with schemes, we rather reduce the
situation to the one studied in the previous subsection. To this end we canonically
extend each function $f$ on $D_s$ to a function $f^+$ on $D$ as follows. Suppose that
$s = i_1, \ldots, i_l$ and

$$f(d_{i_1}, \ldots, d_{i_l}) = (e_{i_1}, \ldots, e_{i_l}).$$

Let for $j \in [1..n]$

$$e'_j := \begin{cases} e_j & \text{if } j \text{ is an element of } s, \\ d_j & \text{otherwise.} \end{cases}$$

Then we set

$$f^+(d_1, \ldots, d_n) := (e'_1, \ldots, e'_n).$$
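
The canonical extension amounts to a small wrapper. In the sketch below `extend` and `both_to_max` are hypothetical names, tuples stand for elements of the Cartesian product, and schemes use 1-based positions as in the text.

```python
def extend(f, scheme):
    """Return f^+ on n-tuples for a function f with the given scheme."""
    def f_plus(d):
        e = f(tuple(d[i - 1] for i in scheme))
        result = list(d)
        for i, e_i in zip(scheme, e):
            result[i - 1] = e_i        # components named in the scheme are updated
        return tuple(result)           # all other components are copied from d
    return f_plus

def both_to_max(pair):                 # an example function with scheme (1, 3)
    top = max(pair)
    return (top, top)

both_to_max_plus = extend(both_to_max, (1, 3))
```

For instance, `both_to_max_plus((1, 7, 4))` touches only the first and third components and leaves the second one as it was.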


Suppose now that $(D, \sqsubseteq)$ is the Cartesian product of the $\sqcup$-po's $(D_i, \sqsubseteq_i)$,
for $i \in [1..n]$, and $F := \{f_1, \ldots, f_k\}$ is a set of functions with schemes that are
all inflationary and monotonic. Then the following algorithm can be used to
compute the limit of the chaotic iterations of $F^+ := \{f_1^+, \ldots, f_k^+\}$. We say here
that a function $f$ depends on $i$ if $i$ is an element of its scheme.

Generic Chaotic Iteration Algorithm (CI)

d := $(\bot, \ldots, \bot)$;   ($n$ times)
d' := d;
G := F;
while $G \neq \emptyset$ do
   choose $g \in G$; suppose $g$ is with scheme $s$;
   G := G - {g};
   d'[s] := g(d[s]);
   if $d[s] \neq d'[s]$ then
      G := G $\cup$ {f $\in$ F | f depends on some i in s such that $d[i] \neq d'[i]$};
      d[s] := d'[s]
   fi
od
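
A direct Python transcription of the CI algorithm may help to see how the set G evolves. This is only a sketch under stated assumptions: functions are given as (scheme, function) pairs with 1-based schemes, G is kept as a set of indices, and the three example functions on triples of finite sets are hypothetical.

```python
def chaotic_iteration(functions, bottom):
    """functions: list of (scheme, g) pairs; bottom: tuple of least elements."""
    d = list(bottom)
    pending = set(range(len(functions)))          # G, as indices into `functions`
    while pending:
        j = pending.pop()                         # choose g in G; G := G - {g}
        scheme, g = functions[j]
        old = tuple(d[i - 1] for i in scheme)     # d[s]
        new = g(old)                              # d'[s]
        if new != old:
            changed = {i for i, o, n in zip(scheme, old, new) if o != n}
            # add every function that depends on a changed component
            pending |= {m for m, (s, _) in enumerate(functions) if changed & set(s)}
            for i, n in zip(scheme, new):
                d[i - 1] = n
    return tuple(d)

def add_a(t):                                     # scheme (1,)
    (x,) = t
    return (x | {"a"},)

def copy_up(t):                                   # scheme (1, 2)
    x, y = t
    return (x, y | x)

def upper(t):                                     # scheme (2, 3)
    y, z = t
    return (y, z | {v.upper() for v in y})

F = [((1,), add_a), ((1, 2), copy_up), ((2, 3), upper)]
bottom = (frozenset(), frozenset(), frozenset())
result = chaotic_iteration(F, bottom)
```

The order in which `pending.pop()` selects functions is arbitrary, which mirrors the nondeterministic "choose" of the algorithm.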

The  following  observation  will  be  useful  in  the  proof  of  correctness  of  this 
algorithm. 

Note 8. Consider the partial orders $(D_i, \sqsubseteq_i)$, for $i \in [1..n]$, a scheme $s$ on $n$
and a function $f$ with scheme $s$. Then

(i) $f$ is inflationary iff $f^+$ is,

(ii) $f$ is monotonic iff $f^+$ is.


The following result summarizes the properties of the CI algorithm.

Theorem 9.

(i) Every terminating execution of the CI algorithm computes in $d$ the least fixed
point of the function $f$ on $D$ defined by

$$f(x) := \bigsqcup_{i=1}^{k} f_i^+(x).$$

(ii) If all $(D_i, \sqsubseteq_i)$, where $i \in [1..n]$, satisfy the finite chain property, then every
execution of the CI algorithm terminates.

Proof. It is simpler to reason about a modified, but equivalent, algorithm in
which the assignments $d'[s] := g(d[s])$ and $d[s] := d'[s]$ are respectively replaced
by $d' := g^+(d)$ and $d := d'$ and the test $d[s] \neq d'[s]$ by $d \neq d'$.

(i) Note that the formula

$$I := \forall f \in F - G \;\; f^+(d) = d$$

is an invariant of the while loop of the modified algorithm. Thus upon its
termination

$$(G = \emptyset) \wedge I$$

holds, that is

$$\forall f \in F \;\; f^+(d) = d.$$

Consequently, some chaotic iteration of $F^+$ eventually stabilizes at $d$. Hence $d$
is the least fixpoint of the function $f$ defined in item (i) because the Chaotic
Iteration Corollary 7 is applicable here by virtue of Note 8(i) and (ii).

(ii) Consider the lexicographic order of the partial orders $(D, \sqsupseteq)$ and $(N, \leq)$,
defined on the elements of $D \times N$ by

$$(d_1, n_1) \leq_{lex} (d_2, n_2) \text{ iff } d_1 \sqsupset d_2 \text{ or } (d_1 = d_2 \text{ and } n_1 \leq n_2).$$

We use here the inverse order $\sqsupset$ and $N$ denotes the set of natural numbers.

By Note 8(i) all functions $f_i^+$ are inflationary, so with each while loop
iteration of the modified algorithm the pair

$$(d, card\ G)$$

strictly decreases in this order $\leq_{lex}$. However, in general the lexicographic order
$(D \times N, \leq_{lex})$ is not well-founded and in fact termination is not guaranteed.
But assume now additionally that each partial order $(D_i, \sqsubseteq_i)$ satisfies the
finite chain property. Then so does their Cartesian product $(D, \sqsubseteq)$. This means
that $(D, \sqsupseteq)$ is well-founded and consequently so is $(D \times N, \leq_{lex})$ which implies
termination. □

When all considered functions $f_i$ are also idempotent, we can reverse the
order of the two assignments to $G$, that is to put the assignment $G := G - \{g\}$
after the if-then-fi statement, because after applying an idempotent function
there is no use in applying it immediately again. Let us denote by CII the
algorithm resulting from this movement of the assignment $G := G - \{g\}$.

More specialized versions of the CI and CII algorithms can be obtained by
representing $G$ as a queue. To this end we use the operation enqueue(F, Q)
which for a set $F$ and a queue $Q$ enqueues in an arbitrary order all the elements
of $F$ in $Q$, denote the empty queue by empty, and the head and the tail of a
nonempty queue $Q$ respectively by head(Q) and tail(Q). The following algorithm
is then a counterpart of the CI algorithm.

Generic Chaotic Iteration Algorithm with a Queue (CIQ)

d := $(\bot, \ldots, \bot)$;   ($n$ times)
d' := d;
Q := empty;
enqueue(F, Q);
while Q $\neq$ empty do
   g := head(Q); suppose g is with scheme s;
   Q := tail(Q);
   d'[s] := g(d[s]);
   if $d[s] \neq d'[s]$ then
      enqueue({f $\in$ F | f depends on some i in s such that $d[i] \neq d'[i]$}, Q);
      d[s] := d'[s]
   fi
od
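
The queue-based variant can be sketched in the same style, with `collections.deque` playing the role of Q. The example assumes, for illustration only, pairs of integers from 0 to 10 ordered componentwise, which satisfies the finite chain property; the three functions are hypothetical.

```python
from collections import deque

def ciq(functions, bottom):
    """functions: list of (scheme, g) pairs with 1-based schemes."""
    d = list(bottom)
    queue = deque(range(len(functions)))              # enqueue(F, Q)
    while queue:
        j = queue.popleft()                           # g := head(Q); Q := tail(Q)
        scheme, g = functions[j]
        old = tuple(d[i - 1] for i in scheme)
        new = g(old)
        if new != old:
            changed = {i for i, o, n in zip(scheme, old, new) if o != n}
            # enqueue every function that depends on a changed component
            queue.extend(m for m, (s, _) in enumerate(functions)
                         if changed & set(s))
            for i, n in zip(scheme, new):
                d[i - 1] = n
    return tuple(d)

TOP = 10
F = [
    ((1,), lambda t: (max(t[0], 3),)),                          # x >= 3
    ((1, 2), lambda t: (t[0], max(t[1], min(t[0] + 1, TOP)))),  # y >= x + 1
    ((1, 2), lambda t: (max(t[0], t[1] // 2), t[1])),           # x >= y // 2
]
result = ciq(F, (0, 0))
```

As in the pseudocode, a function may be present in the queue several times; this sketch makes no attempt to avoid such duplicates.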


Denote by CIIQ the modification of the CIQ algorithm that is appropriate for
the idempotent functions, so the one in which the assignment Q := tail(Q) is
performed after the if-then-fi statement.

It is easy to see that the claims of Theorem 9 also hold for the CII, CIQ and
CIIQ algorithms. A natural question arises whether for the specialized versions
CIQ and CIIQ some additional properties can be established. The answer is
positive. Namely, for these two algorithms the following result holds which shows
that the nondeterminism present in these algorithms has no bearing on their
termination.

Theorem  10.  If  some  execution  of  the  CIQ  algorithm  terminates,  then  all  the 
executions  of  the  CIQ  algorithm  terminate. 

Proof.  We  first  establish  the  following  observation. 

Claim 1 If some chaotic iteration of $F^+$ eventually stabilizes, then all the
executions of the CIQ algorithm terminate.

Proof. We prove the contrapositive. Consider an infinite execution of the CIQ
algorithm. Let $i_1, i_2, \ldots$ be the run associated with it and $\xi := d_0, d_1, \ldots$
the iteration of $F^+$ associated with this run. By the structure of this algorithm

$$\xi \text{ does not stabilize.} \quad (3)$$

Let $A$ be the set of the elements of $[1..k]$ that appear finitely often in the run
$i_1, i_2, \ldots$. For some $m \geq 0$ we have $i_j \notin A$ for $j > m$. This means by the
structure of this algorithm that after $m$ iterations of the while loop no function
$f_i$ for $i \in A$ is ever present in the queue $Q$.

By virtue of the invariant $I$ used in the proof of Theorem 9 we then have
$f_i^+(d_j) = d_j$ for $i \in A$ and $j \geq m$. This allows us to transform the iteration $\xi$ to
a chaotic one by repeating each element $d_j$ for $j \geq m$ $card\ A$ times.

Assume now that a chaotic iteration of $F^+$ eventually stabilizes. Then by the
Chaotic Iteration Corollary 7 the just constructed chaotic iteration stabilizes, as
well. So the original iteration $\xi$ also stabilizes which contradicts (3). □

Construct now a chaotic iteration of $F^+$ the initial prefix of which
corresponds with a terminating execution of the CIQ algorithm. By virtue of the
invariant $I$ this iteration eventually stabilizes. This concludes the proof thanks
to Claim 1. □

An analogous result holds for the CIIQ algorithm. On the other hand, it is
easy to see that this result does not hold for the CI and CII algorithms.


3  Constraint  Propagation

Let  us  return  now  to  the  study  of  CSP’s.  We  show  here  how  the  results  of  the 
previous  section  can  be  used  to  explain  the  constraint  propagation  process. 


3.1  Domain  Reduction 

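In this subsection we study the domain reduction process. First we associate
with each CSP an $\sqcup$-po that “focuses” on the domain reduction.

Consider a CSP $\mathcal{P} := \langle D_1, \ldots, D_n; \mathcal{C} \rangle$. Let for $X, Y \subseteq D_i$

$$X \sqsubseteq_i Y \text{ iff } X \supseteq Y.$$

Then for $i \in [1..n]$ $(\mathcal{P}(D_i), \sqsubseteq_i)$ is an $\sqcup$-po with $\bot_i = D_i$ and $X \sqcup_i Y = X \cap Y$.
Consequently, the Cartesian product $(DO, \sqsubseteq)$ of $(\mathcal{P}(D_i), \sqsubseteq_i)$, where $i \in [1..n]$,
is also an $\sqcup$-po. We call $(DO, \sqsubseteq)$ the domain $\sqcup$-po associated with $\mathcal{P}$.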

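As in Subsection 2.2, for a scheme $s := i_1, \ldots, i_l$ we denote by $(DO_s, \sqsubseteq_s)$
the Cartesian product of the partial orders $(\mathcal{P}(D_{i_j}), \sqsubseteq_{i_j})$, where $j \in [1..l]$.

Note that $DO_s = \mathcal{P}(D_{i_1}) \times \cdots \times \mathcal{P}(D_{i_l})$. Because we want now to use
constraints in our analysis and constraints are sets of tuples, we identify $DO_s$ with
the set

$$\{X_1 \times \cdots \times X_l \mid X_j \subseteq D_{i_j} \text{ for } j \in [1..l]\}.$$

In this way we can write the elements of $DO_s$ as Cartesian products $X_1 \times \cdots \times X_l$,
so as (specific) sets of $l$-tuples, instead of as $(X_1, \ldots, X_l)$, and similarly with $DO$.

Note that because of the use of the inverse subset order $\supseteq$ we have for
$X_1 \times \cdots \times X_l \in DO_s$ and $Y_1 \times \cdots \times Y_l \in DO_s$

$$X_1 \times \cdots \times X_l \sqsubseteq_s Y_1 \times \cdots \times Y_l \text{ iff } X_1 \times \cdots \times X_l \supseteq Y_1 \times \cdots \times Y_l$$

$$(\text{iff } X_i \supseteq Y_i \text{ for } i \in [1..l]),$$

$$(X_1 \times \cdots \times X_l) \sqcup_s (Y_1 \times \cdots \times Y_l) = (X_1 \times \cdots \times X_l) \cap (Y_1 \times \cdots \times Y_l)$$

$$(= (X_1 \cap Y_1) \times \cdots \times (X_l \cap Y_l)).$$

Moreover, $D_1 \times \cdots \times D_n$ is the least element of $DO$.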

So far we have defined an $\sqcup$-po associated with a CSP. Next, we introduce
functions by means of which chaotic iterations will be generated. These functions
are associated with constraints. Constraints are arbitrary sets of $k$-tuples for
some $k$, while the $\sqsubseteq_s$ order and the $\sqcup_s$ operation are defined only on Cartesian
products. So to define these functions we use the set theoretic counterparts $\supseteq$
and $\cap$ of $\sqsubseteq_s$ and $\sqcup_s$ which are defined on arbitrary sets.

Definition 11. Consider a sequence of domains $D_1, \ldots, D_n$ and a scheme $s$ on
$n$. By a domain reduction function for a constraint $C$ with scheme $s$ we mean a
function $f$ on $DO_s$ such that for all $D \in DO_s$

- $D \supseteq f(D)$,

- $C \cap D = C \cap f(D)$. □

The first condition states that $f$ reduces the “current” domains associated
with the constraint $C$ (so no solution to $C$ is “gained”), while the second
condition states that during this domain reduction process no solution to $C$ is “lost”.
In particular, the second condition implies that if $C \subseteq D$ then $C \subseteq f(D)$.

Note that for the partial order $(DO_s, \sqsubseteq_s)$ a function $f$ on $DO_s$ is inflationary
iff $D \supseteq f(D)$ and $f$ is monotonic iff it is monotonic w.r.t. the set inclusion.

Example 1. As a simple example of domain reduction functions consider a
binary constraint $C \subseteq D_1 \times D_2$. Define now the functions $f_1$ and $f_2$ on
$DO_{1,2} := \mathcal{P}(D_1) \times \mathcal{P}(D_2)$ as follows:

$$f_1(X \times Y) := X' \times Y,$$

where $X' = \{a \in X \mid \exists b \in Y \; (a, b) \in C\}$, and

$$f_2(X \times Y) := X \times Y',$$

where $Y' = \{b \in Y \mid \exists a \in X \; (a, b) \in C\}$. It is straightforward to check that $f_1$
and $f_2$ are indeed domain reduction functions. Further, these functions are
monotonic w.r.t. the set inclusion and idempotent. □
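
The two functions of Example 1 are easy to write down in Python. In the sketch below a Cartesian product X × Y is represented by the pair (X, Y), and the constraint a < b on the domains {1, 2, 3} is an assumed example; `as_product` is a hypothetical helper used only to state the two conditions of Definition 11.

```python
from itertools import product

def make_f1(C):
    """f1(X x Y) := X' x Y with X' = {a in X | some b in Y has (a, b) in C}."""
    def f1(X, Y):
        return {a for a in X if any((a, b) in C for b in Y)}, set(Y)
    return f1

def make_f2(C):
    """f2(X x Y) := X x Y' with Y' = {b in Y | some a in X has (a, b) in C}."""
    def f2(X, Y):
        return set(X), {b for b in Y if any((a, b) in C for a in X)}
    return f2

D1 = D2 = {1, 2, 3}
C = {(a, b) for a in D1 for b in D2 if a < b}      # assumed constraint a < b
f1, f2 = make_f1(C), make_f2(C)

X1, Y1 = f1(D1, D2)       # the first domain loses the value 3
X2, Y2 = f2(D1, D2)       # the second domain loses the value 1

def as_product(X, Y):
    """The set of pairs X x Y, for comparison with the constraint C."""
    return set(product(X, Y))
```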

Take now a CSP $\mathcal{P} := \langle D_1, \ldots, D_n; \mathcal{C} \rangle$ and a sequence of domains $D'_1, \ldots, D'_n$
such that $D'_i \subseteq D_i$ for $i \in [1..n]$. Consider a CSP $\mathcal{P}'$ obtained from $\mathcal{P}$ by replacing
each domain $D_i$ by $D'_i$ and by restricting each constraint in $\mathcal{C}$ to these new
domains. We say then that $\mathcal{P}'$ is determined by $\mathcal{P}$ and $D'_1 \times \cdots \times D'_n$.

Consider now a CSP $\mathcal{P} := \langle D_1, \ldots, D_n; \mathcal{C} \rangle$ and a domain reduction function
$f$ for a constraint $C$ of $\mathcal{C}$. Suppose that

$$f^+(D_1 \times \cdots \times D_n) = D'_1 \times \cdots \times D'_n,$$

where $f^+$ is the canonic extension of $f$ to $DO$ defined in Subsection 2.2. We now
define $f(\mathcal{P})$ to be the CSP determined by $\mathcal{P}$ and $D'_1 \times \cdots \times D'_n$. The following
observation holds.

Lemma 12. Consider a CSP $\mathcal{P}$ and a domain reduction function $f$. Then $\mathcal{P}$
and $f(\mathcal{P})$ are equivalent.

Proof. Suppose that $D_1, \ldots, D_n$ are the domains of $\mathcal{P}$ and assume that $f$ is a
domain reduction function for $C$ with scheme $i_1, \ldots, i_l$. Let

$$f(D_{i_1} \times \cdots \times D_{i_l}) = D'_{i_1} \times \cdots \times D'_{i_l}.$$

Take now a solution $d$ to $\mathcal{P}$. Then $d[i_1, \ldots, i_l] \in C$, so by the definition of $f$
also $d[i_1, \ldots, i_l] \in D'_{i_1} \times \cdots \times D'_{i_l}$. So $d$ is also a solution to $f(\mathcal{P})$. The converse
implication holds by the definition of a domain reduction function. □

When dealing with a specific CSP we have in general several domain
reduction functions. To study their interaction we can use the Chaotic Iteration
Theorem 6 in conjunction with the above Note. After translating the relevant
notions into set theoretic terms we get the following direct consequence of these
results. (In this translation $DO_s$ corresponds to $D_s$ and $DO$ to $D$.)

Theorem 13 (Domain Reduction). Consider a CSP $\mathcal{P} := \langle D_1, \ldots, D_n; \mathcal{C} \rangle$.
Let $F := \{f_1, \ldots, f_k\}$, where each $f_i$ is a domain reduction function for some
constraint in $\mathcal{C}$. Suppose that all functions $f_i$ are monotonic w.r.t. the set
inclusion. Then

- the limit of every chaotic iteration of $\{f_1^+, \ldots, f_k^+\}$ exists;

- this limit coincides with

$$\bigcap_{j=0}^{\infty} f^j(D_1 \times \cdots \times D_n),$$

where the function $f$ on $DO$ is defined by:

$$f(D) := \bigcap_{i=1}^{k} f_i^+(D).$$

- the CSP determined by $\mathcal{P}$ and this limit is equivalent to $\mathcal{P}$. □

Informally, this theorem states that the order of the applications of the
domain reduction functions does not matter, as long as none of them is indefinitely
neglected.

Consider now a CSP $\mathcal{P}$ and suppose that the domain $\sqcup$-po associated with
it satisfies the finite chain property. Then we can use the CI, CII, CIQ and
CIIQ algorithms to compute the limits of the chaotic iterations considered in
the above Theorem. We shall explain in Subsection 4.1 how by instantiating
these algorithms with specific domain reduction functions we obtain specific
algorithms considered in the literature. In each case, by virtue of Theorem 9 and
its reformulations for the CII, CIQ and CIIQ algorithms, we can conclude that
these algorithms compute the greatest common fixpoint w.r.t. the set inclusion
of the functions from $F^+$.
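
The order independence stated in the Domain Reduction Theorem 13 can be observed on a tiny CSP. The sketch below assumes three variables x, y, z with domains {1, 2, 3} and the binary constraints x < y and y < z; it applies the reduction functions of Example 1 in two different periodic orders and also checks that no solution is lost. The names `revise`, `reduce_domains` and `solutions` are hypothetical.

```python
from itertools import product

def revise(domains, constraint, target):
    """Example-1-style reduction of the domain of `target` w.r.t. one constraint."""
    (x, y), rel = constraint
    if target == x:
        return {a for a in domains[x] if any((a, b) in rel for b in domains[y])}
    return {b for b in domains[y] if any((a, b) in rel for a in domains[x])}

def reduce_domains(domains, constraints, order):
    """Apply all reduction functions in the given periodic order until stable."""
    domains = {v: set(vals) for v, vals in domains.items()}
    functions = [(c, t) for c in constraints for t in c[0]]
    changed = True
    while changed:                       # repeating a full period gives a fair run
        changed = False
        for j in order:                  # `order` must list every function index
            constraint, target = functions[j]
            new = revise(domains, constraint, target)
            if new != domains[target]:
                domains[target], changed = new, True
    return domains

def solutions(domains, constraints):
    names = sorted(domains)
    found = set()
    for values in product(*(domains[v] for v in names)):
        d = dict(zip(names, values))
        if all((d[x], d[y]) in rel for (x, y), rel in constraints):
            found.add(values)
    return found

LESS = {(a, b) for a in range(1, 4) for b in range(1, 4) if a < b}
constraints = [(("x", "y"), LESS), (("y", "z"), LESS)]
start = {"x": {1, 2, 3}, "y": {1, 2, 3}, "z": {1, 2, 3}}

reduced_a = reduce_domains(start, constraints, [0, 1, 2, 3])
reduced_b = reduce_domains(start, constraints, [3, 2, 1, 0])
```

Both orders end with the same domains, and the reduced CSP has the same solutions as the original one, as the theorem predicts for monotonic domain reduction functions.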

3.2  Constraint  Reduction 

We now study the constraint reduction process. As in the previous subsection
we begin by associating with each CSP an $\sqcup$-po that “focuses” on the constraint
reduction.

Consider a CSP $\mathcal{P} := \langle \mathcal{D}; C_1, \ldots, C_k \rangle$. Let for $X, Y \subseteq C_i$

$$X \sqsubseteq_i Y \text{ iff } X \supseteq Y.$$

Let now $(CO, \sqsubseteq)$ be the Cartesian product of the $\sqcup$-po's $(\mathcal{P}(C_i), \sqsubseteq_i)$, where
$i \in [1..k]$. We call $(CO, \sqsubseteq)$ the constraint $\sqcup$-po associated with $\mathcal{P}$.

Following the notation of the previous subsection, for a scheme $s := i_1, \ldots, i_l$
on $k$ we denote by $(CO_s, \sqsubseteq_s)$ the Cartesian product of the partial orders
$(\mathcal{P}(C_{i_j}), \sqsubseteq_{i_j})$, where $j \in [1..l]$, and identify $CO_s$ with the set

$$\{X_1 \times \cdots \times X_l \mid X_j \subseteq C_{i_j} \text{ for } j \in [1..l]\},$$

and similarly with $CO$.

Next, we define functions that will be used to generate chaotic iterations.

Definition 14. Consider a CSP $\langle \mathcal{D}; C_1, \ldots, C_k \rangle$ and a scheme $s$ on $k$. By a
constraint reduction function with scheme $s$ we mean a function $g$ on $CO_s$ such that
for all $\mathbf{C} \in CO_s$

- $\mathbf{C} \supseteq g(\mathbf{C})$,

- $Sol(\mathbf{C}) = Sol(g(\mathbf{C}))$. □

$\mathbf{C}$ is here a Cartesian product of some constraints and in the second condition
and in the example below we identified it with the sequence of these constraints,
and similarly with $g(\mathbf{C})$. The first condition states that $g$ reduces the constraints
$C_i$, where $i$ is an element of $s$, while the second condition states that during this
constraint reduction process no solution to $\mathbf{C}$ is lost.

Example 2. As an example of a constraint reduction function consider the
following function $g$ on some $CO_s$:

$$g(C \times \mathbf{C}) := C' \times \mathbf{C},$$

where $C' = \Pi_t(Sol(C, \mathbf{C}))$ and $t$ is the scheme of $C$. To see that $g$ is indeed a
constraint reduction function, first note that by the definition of $Sol$ we have
$C' \subseteq C$, so $C \times \mathbf{C} \supseteq g(C \times \mathbf{C})$. Next, note that for $d \in Sol(C, \mathbf{C})$ we have
$d[t] \in \Pi_t(Sol(C, \mathbf{C}))$, so $d \in Sol(C', \mathbf{C})$. This implies that $Sol(C, \mathbf{C}) = Sol(g(C, \mathbf{C}))$.
Note also that $g$ is monotonic w.r.t. the set inclusion and idempotent. □
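
The function of Example 2 can be sketched by brute force on a small CSP. The sketch assumes three variables with domains {0, 1}, constraints given as (scheme, set of tuples) pairs with 1-based schemes, and hypothetical helper names; it is meant only to make the two conditions of Definition 14 checkable on one instance.

```python
from itertools import product

def restrict(d, scheme):
    """d[scheme] for an n-tuple d and a 1-based scheme."""
    return tuple(d[i - 1] for i in scheme)

def sol(domains, constraints):
    """All n-tuples over `domains` that satisfy every constraint."""
    return {d for d in product(*domains)
            if all(restrict(d, s) in rel for s, rel in constraints)}

def reduce_first(domains, first, rest):
    """Replace `first` by the projection of Sol(first, rest) on its scheme."""
    t, _ = first
    reduced = {restrict(d, t) for d in sol(domains, [first] + rest)}
    return (t, reduced), rest

DOMAINS = [{0, 1}, {0, 1}, {0, 1}]
first = ((1, 2), set(product({0, 1}, repeat=2)))     # no restriction yet
rest = [((2, 3), {(0, 1)}), ((1, 3), {(1, 1)})]
new_first, new_rest = reduce_first(DOMAINS, first, rest)
```

In this instance the only solution is (1, 0, 1), so the constraint on the variables 1 and 2 shrinks to the single pair (1, 0) while the set of solutions stays the same.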

Example 3. As another example that is of importance for the discussion in
Subsection 4.1 consider a CSP $\langle D_1, \ldots, D_n; \mathcal{C} \rangle$ of binary constraints such that for
each scheme $i, j$ on $n$ there is exactly one constraint, which we denote by $C_{i,j}$.

Define now for each scheme $k, l, m$ on $n$ the following function $g^m_{k,l}$ on $CO_s$,
where $s$ is the triple corresponding to the positions of the constraints $C_{k,l}$, $C_{k,m}$
and $C_{m,l}$ in $\mathcal{C}$:

$$g^m_{k,l}(X_{k,l} \times X_{k,m} \times X_{m,l}) := (X_{k,l} \cap \Pi_{k,l}(X_{k,m} \bowtie X_{m,l})) \times X_{k,m} \times X_{m,l}.$$

To prove that the functions $g^m_{k,l}$ are constraint reduction functions it suffices
to note that by simple properties of the $\bowtie$ operation and by Note 2(i) we have

$$X_{k,l} \cap \Pi_{k,l}(X_{k,m} \bowtie X_{m,l}) = \Pi_{k,l}(X_{k,l} \bowtie X_{k,m} \bowtie X_{m,l}),$$

so these functions are special cases of the functions defined in Example 2. □
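
For binary constraints stored as sets of pairs, the function of Example 3 reduces to a relational composition followed by an intersection. The sketch below assumes that a pair in a constraint on (k, m) lists the value for k first and the value for m second, and likewise for the other two constraints; the names and the sample relations are hypothetical.

```python
def compose(x_km, x_ml):
    """Projection on (k, l) of the join of a (k, m)- and an (m, l)-constraint."""
    return {(a, b) for (a, c) in x_km for (c2, b) in x_ml if c == c2}

def path_step(x_kl, x_km, x_ml):
    """Keep in x_kl only the pairs supported by some value of the third variable."""
    return x_kl & compose(x_km, x_ml), x_km, x_ml

x_kl = {(1, 5), (1, 6), (2, 5)}
x_km = {(1, 2), (2, 3)}
x_ml = {(2, 5), (3, 6)}
reduced, _, _ = path_step(x_kl, x_km, x_ml)
```

Applying `path_step` a second time to the result changes nothing, in line with the idempotence noted in Example 2.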

Take now a CSP $\mathcal{P} := \langle \mathcal{D}; C_1, \ldots, C_k \rangle$ and a sequence of constraints $C'_1, \ldots, C'_k$
such that $C'_i \subseteq C_i$ for $i \in [1..k]$. Let $\mathcal{P}' := \langle \mathcal{D}; C'_1, \ldots, C'_k \rangle$. We say then that $\mathcal{P}'$
is determined by $\mathcal{P}$ and $C'_1 \times \cdots \times C'_k$.

Consider now a CSP $\mathcal{P} := \langle \mathcal{D}; C_1, \ldots, C_k \rangle$ and a constraint reduction function
$g$ with scheme $s$. Suppose that

$$g^+(C_1 \times \cdots \times C_k) = C'_1 \times \cdots \times C'_k,$$

where $g^+$ is the canonic extension of $g$ to $CO$ defined in Subsection 2.2. We now
define

$$g(\mathcal{P}) := \langle \mathcal{D}; C'_1, \ldots, C'_k \rangle.$$

We  have  the  following  observation. 

Lemma 15. Consider a CSP $\mathcal{P}$ and a constraint reduction function $g$. Then $\mathcal{P}$
and $g(\mathcal{P})$ are equivalent.

Proof. Suppose that $s$ is the scheme of the function $g$ and let $\mathbf{C}$ be an element
of $CO_s$. $\mathbf{C}$ is a Cartesian product of some constraints. As before we identify it
with the sequence of these constraints. For some sequence of schemes $\mathbf{s}$, $\mathbf{C}$ is
the $\mathbf{s}$-sequence of the constraints of $\mathcal{P}$.

Let now $d$ be a solution to $\mathcal{P}$. Then by Note 2(ii) we have $d[\langle \mathbf{s} \rangle] \in Sol(\mathbf{C})$,
so by the definition of $g$ also $d[\langle \mathbf{s} \rangle] \in Sol(g(\mathbf{C}))$. Hence for every constraint
$C'$ in $g(\mathbf{C})$ with scheme $s'$ we have $d[s'] \in C'$ since $d[\langle \mathbf{s} \rangle][s'] = d[s']$. So $d$ is a
solution to $g(\mathcal{P})$. The converse implication holds by the definition of a constraint
reduction function.

□

As in the case of the domain reduction we can now apply the results of
Section 2 to study the outcome of the constraint reduction process. To this
end it suffices to translate the relevant notions into set theoretic terms. (In this
translation $CO_s$ corresponds to $D_s$ and $CO$ to $D$.) We get then the following
counterpart of the Domain Reduction Theorem 13.

Theorem 16 (Constraint Reduction). Consider a CSP $\mathcal{P} := \langle \mathcal{D}; C_1, \ldots, C_k \rangle$.
Let $F := \{g_1, \ldots, g_k\}$, where each $g_i$ is a constraint reduction function. Suppose
that all functions $g_i$ are monotonic w.r.t. the set inclusion. Then

- the limit of every chaotic iteration of $\{g_1^+, \ldots, g_k^+\}$ exists;

- this limit coincides with

$$\bigcap_{j=0}^{\infty} g^j(C_1 \times \cdots \times C_k),$$

where the function $g$ on $CO$ is defined by:

$$g(\mathbf{C}) := \bigcap_{i=1}^{k} g_i^+(\mathbf{C}).$$

- the CSP determined by $\mathcal{P}$ and this limit is equivalent to $\mathcal{P}$. □

When the constraint $\sqcup$-po associated with a CSP $\mathcal{P}$ satisfies the finite chain
property, we can use the algorithms discussed in Subsection 2.2 to compute the
limits of the chaotic iterations considered in the above Theorem. We return to
this issue in Subsection 4.1. Also here, as in the previous subsection, we can
conclude by virtue of Theorem 9 that these algorithms compute the greatest
common fixpoint w.r.t. the set inclusion of the functions from $F^+$. So the limit of
the constraint propagation process could be added to the collection of important
greatest fixpoints presented in Barwise and Moss (1996).

Next,  we  show  how  specific  provably  correct  algorithms  for  achieving  a  local 
consistency  notion  can  be  automatically  derived.  As  it  is  difficult  to  define  local 
consistency  formally,  we  illustrate  the  idea  on  an  example. 

Example 4. We consider here the notion of relational consistency proposed
recently in Dechter and van Beek (1997).

To define it we need to introduce some auxiliary concepts first. Consider a CSP
$\langle D_1, \ldots, D_n; \mathcal{C} \rangle$. Take a scheme $t := i_1, \ldots, i_l$ on $n$. We call $d \in D_{i_1} \times \cdots \times D_{i_l}$ a
tuple of type $t$ and say that $d$ is consistent if for every subsequence $s$ of $t$ and a
constraint $C \in \mathcal{C}$ with scheme $s$ we have $d[s] \in C$.

A CSP $\mathcal{P}$ is called relationally $m$-consistent if for any $\mathbf{s}$-sequence $C_1, \ldots, C_m$
of different constraints of $\mathcal{P}$ and a subsequence $t$ of $\langle \mathbf{s} \rangle$, every consistent tuple of
type $t$ belongs to $\Pi_t(C_1 \bowtie \cdots \bowtie C_m)$.

As  the  first  step  we  characterize  this  notion  as  a  common  fixed  point  of  a 
finite  set  of  monotonic  and  inflationary  functions. 

Consider a CSP $\mathcal{P} := \langle D_1, \ldots, D_n; C_1, \ldots, C_k \rangle$. Assume for simplicity that
for every scheme $s$ on $n$ there is a unique constraint with scheme $s$. Each CSP
is trivially equivalent with such a CSP: it suffices to replace for each scheme
$s$ the set of constraints with scheme $s$ by their intersection and to introduce
“universal constraints” for the schemes without a constraint.

Consider now a scheme $i_1, \ldots, i_m$ on $k$. Let $\mathbf{s}$ be such that $C_{i_1}, \ldots, C_{i_m}$ is
an $\mathbf{s}$-sequence of constraints and let $t$ be a subsequence of $\langle \mathbf{s} \rangle$. Further, let $C_{i_0}$
be the constraint of $\mathcal{P}$ with scheme $t$. Put $s := \langle (i_0), (i_1, \ldots, i_m) \rangle$. (Note that
if $i_0$ does not appear in $i_1, \ldots, i_m$ then $s = i_0, i_1, \ldots, i_m$ and otherwise $s$ is the
permutation of $i_1, \ldots, i_m$ obtained by transposing $i_0$ with the first element.)

Define now a function $g_s$ on $CO_s$ by

$$g_s(C \times \mathbf{C}) := (C \cap \Pi_t(\bowtie \mathbf{C})) \times \mathbf{C}.$$

It is easy to see that if for each function $g_s$ of the above form we have

$$g_s^+(C_1 \times \cdots \times C_k) = C_1 \times \cdots \times C_k,$$

then $\mathcal{P}$ is relationally $m$-consistent. (The converse implication is in general not
true.) Note that the functions $g_s$ are inflationary and monotonic w.r.t. the inverse
subset order $\supseteq$ and also idempotent.

Consequently,  by  virtue  of  Theorem  9  reformulated  for  the  CII  algorithm, 
we  can  now  use  the  CII  algorithm  to  achieve  relational  m-consistency  for  a  CSP 
with  finite  domains  by  “feeding”  into  this  algorithm  the  above  defined  functions. 
The  obtained  algorithm  improves  upon  the  (authors’  terminology)  brute  force 
algorithm  proposed  in  Dechter  and  van  Beek  (1997)  since  the  useless  constraint 
modifications  are  avoided. 

As in Example 3, by simple properties of the $\bowtie$ operation and by Note 2(i)
we have

$$C \cap \Pi_t(\bowtie \mathbf{C}) = \Pi_t(C \bowtie (\bowtie \mathbf{C})) = \Pi_t(Sol(C, \mathbf{C})).$$

Hence, by virtue of Example 2, the functions $g_s$ are all constraint reduction
functions. Consequently, by the Constraint Reduction Theorem 16 we conclude
that the CSP computed by the just discussed algorithm is equivalent to the
original one. □

It  is  perhaps  worthwhile  to  note  that  the  domain  reduction  process  can  be 
seen  as  a  special  case  of  the  constraint  reduction  process.  To  this  end  it  suffices 
to  introduce  unary  constraints  each  of  which  coincides  with  a  different  domain 
of  the  given  CSP  and  replace  the  reduction  of  the  domains  by  the  reduction  of 
these  unary  constraints  followed  by  the  restriction  of  the  other  constraints  to 
these  reduced  unary  constraints.  So  the  domain  reduction  functions  can  be  seen 
as  special  cases  of  the  constraint  reduction  functions. 

We  decided  to  consider  the  domain  reduction  process  separately,  because,  as 
we  shall  see  in  the  next  section,  it  has  been  extensively  studied,  especially  in 
the  context  of  CSP’s  with  binary  constraints  and  of  interval  arithmetic.  Con¬ 
sequently,  it  is  useful  to  analyze  it  directly,  without  any  introduction  of  new 
constraints. 


4  Concluding  Remarks 

4.1  Related  Work 

It is illuminating to see how attempts at finding general principles behind the
constraint propagation algorithms repeatedly recur in the literature on constraint
satisfaction problems spanning the last twenty years.

As  already  stated  in  the  introduction,  the  aim  of  the  constraint  propagation 
algorithms  is  most  often  to  achieve  some  form  of  local  consistency.  As  a  result 
these  algorithms  are  usually  called  in  the  literature  “consistency  algorithms”  or 
“consistency enforcing algorithms”.

To start with, in Mackworth (1977) a unified framework was proposed to explain
the so-called arc- and path-consistency algorithms. Also the arc-consistency
algorithm AC-3 and the path-consistency algorithm PC-2 were proposed and the
latter algorithm was obtained from the former one by pursuing the analogy
between both notions of consistency.

The AC-3 consistency algorithm can be obtained by instantiating the CII
algorithm with the domain reduction functions defined in Example 1, whereas
the PC-2 algorithm can be obtained by instantiating this algorithm with the
domain reduction functions defined in Example 3.
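To illustrate the idea of obtaining an arc-consistency procedure by instantiating a generic iteration with domain reduction functions, here is a minimal sketch. It is not the CII algorithm itself nor AC-3 (which maintains a queue); it simply applies the functions in round-robin order, which is one fair schedule, until a common fixpoint is reached. The names `revise`, `iterate_to_fixpoint` and `example` are hypothetical.

```python
from typing import Callable, Dict, List, Set

Domains = Dict[str, Set[int]]
Reduction = Callable[[Domains], Domains]

def revise(x: str, y: str, rel: Callable[[int, int], bool]) -> Reduction:
    """Domain reduction function: drop from D[x] every value with no support in D[y]."""
    def f(d: Domains) -> Domains:
        new = dict(d)
        new[x] = {a for a in d[x] if any(rel(a, b) for b in d[y])}
        return new
    return f

def iterate_to_fixpoint(d: Domains, fs: List[Reduction]) -> Domains:
    """Apply the functions round-robin until none of them changes the domains."""
    changed = True
    while changed:
        changed = False
        for f in fs:
            new = f(d)
            if new != d:
                d, changed = new, True
    return d

def example() -> Domains:
    """Constraints x < y and y < z over the domains {1, 2, 3}."""
    d = {"x": {1, 2, 3}, "y": {1, 2, 3}, "z": {1, 2, 3}}
    lt = lambda a, b: a < b
    gt = lambda a, b: a > b
    fs = [revise("x", "y", lt), revise("y", "x", gt),
          revise("y", "z", lt), revise("z", "y", gt)]
    return iterate_to_fixpoint(d, fs)
```

Because every function only removes values and the domains are finite, the loop terminates; for this toy CSP it ends with the singleton domains x = {1}, y = {2}, z = {3}.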

In Dechter and Pearl (1988) the notions of arc- and path-consistency were
modified to directional arc- and path-consistency, versions that take into account
some total order $<_d$ of the domain indices, and the algorithms for achieving
these forms of consistency were presented. These algorithms can be obtained as
instances of the CIQ algorithm as follows.

For the case of directional arc-consistency the queue in this algorithm should
be instantiated with the set of the domain reduction functions $f_1$ of Example 1
for the constraints the scheme of which is consistent with the $<_d$ order. These
functions should be ordered in such a way that the domain reduction functions
for the constraint with the $<_d$-large second index appear earlier. This order
has the effect that the enqueue operation within the if-then-fi statement
always has the empty set as the first argument, so it can be deleted. Consequently,
the algorithm can be rewritten as a simple for loop that processes the selected
domain reduction functions $f_1$ in the appropriate order.
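The "simple for loop" described above can be sketched as follows. This is an illustration under stated assumptions, not the algorithm of Dechter and Pearl verbatim: variables are identified with integer indices, the order $<_d$ is taken to be the usual order on these indices, and each binary constraint (i, j, rel) with i < j only reduces the domain of its first variable. The function name is hypothetical.

```python
from typing import Callable, List, Set, Tuple

# (i, j, relation) with i < j in the assumed order on variable indices
BinaryConstraint = Tuple[int, int, Callable[[int, int], bool]]

def directional_arc_consistency(domains: List[Set[int]],
                                constraints: List[BinaryConstraint]) -> List[Set[int]]:
    """Single pass: constraints with the larger second index are processed first."""
    d = [set(s) for s in domains]
    for i, j, rel in sorted(constraints, key=lambda c: c[1], reverse=True):
        # Reduce only the domain of the first variable of the constraint.
        d[i] = {a for a in d[i] if any(rel(a, b) for b in d[j])}
    return d

lt = lambda a, b: a < b
result = directional_arc_consistency([{1, 2, 3}, {1, 2, 3}, {1, 2, 3}],
                                     [(0, 1, lt), (1, 2, lt)])
```

No constraint is ever revisited: once the domain of variable j has been used to reduce variable i, later steps only touch variables that come before j in the order.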

For the case of directional path-consistency the constraint reduction functions
$f^m_{k,l}$ should be used only for $k, l <_d m$ and the queue in the CIQ algorithm should
be initialized in such a way that the functions with the $<_d$-large $m$ index
appear earlier. As in the case of directional arc-consistency this algorithm can
be rewritten as a simple for loop.

In  Montanari  and  Rossi  (1991)  a  general  study  of  constraint  propagation  was 
undertaken  by  defining  the  notion  of  a  relaxation  rule  and  by  proposing  a  general 
relaxation  algorithm.  The  notion  of  a  relaxation  rule  coincides  with  our  notion 
of  a  constraint  propagation  function  instantiated  with  the  functions  defined  in 
Example  2  and  the  general  relaxation  algorithm  is  the  corresponding  instance 
of our CI algorithm.

In  Montanari  and  Rossi  (1991)  it  was  also  shown  that  the  notions  of  arc- 
consistency  and  path-consistency  can  be  defined  by  means  of  relaxation  rules 
and  that  as  a  result  arc-consistency  and  path-consistency  algorithms  can  be 
obtained  by  instantiating  with  these  rules  their  general  relaxation  algorithm. 

Van  Hentenryck,  Deville  and  Teng  (1992)  presented  a  generic  arc  consistency 
algorithm, called AC-5, that can be specialized to the known arc-consistency
algorithms  AC-3  and  AC-4  and  also  to  new  arc-consistency  algorithms  for  specific 
classes  of  constraints. 

In Benhamou, McAllester and Van Hentenryck (1994) and Benhamou and Older
(1997) specific functions, called narrowing functions, were associated with
constraints in the context of interval arithmetic for reals and some properties of
them were established that in our terminology mean that these are idempotent
domain reduction functions. As a consequence the algorithms proposed in
these papers, called respectively a fixpoint algorithm and a narrowing algorithm,
become instances of our CIIQ algorithm and CII algorithm, respectively.

The importance of fairness for the study of constraint propagation was noticed
in Montanari and Rossi (1991), while the relevance of the chaotic iteration
was independently noticed in Fages, Fowler and Sola (1996) and van Emden
(1996). In the latter paper the generic chaotic iteration algorithm CII was formulated
and proved correct for the domain reduction functions defined in Benhamou
and Older (1997) and it was shown that the limit of the constraint propagation
process for these functions is their greatest common fixpoint.

The  idea  that  the  meaning  of  a  constraint  is  a  function  (on  a  constraint  store) 
with some algebraic properties was put forward in Saraswat, Rinard and Panangaden
(1991), where the properties of being inflationary (called there extensive),
monotonic  and  idempotent  were  singled  out. 

It  is  unrealistic  to  expect  that  all  constraint  propagation  algorithms  presented 
in  the  literature  can  be  expressed  as  direct  instances  of  the  algorithms  discussed 
in this paper. For example, the AC-4 algorithm of Mohr and Henderson (1986)
associates  with  each  domain  element  some  information  concerning  its  links  with 
the  elements  of  other  domains.  As  a  result  this  algorithm  operates  on  some 
“enhancement”  of  the  original  domains. 

We noted, however, that even in this case the analysis provided here can
be used to explain this algorithm. To this end one needs to reason about the
translation  of  the  original  CSP  to  a  CSP  defined  on  the  enhanced  domains.  This 
analysis  allows  us  to  reduce  the  proof  of  the  correctness  of  this  algorithm  to  the 
proof  that  specific  functions  are  monotonic  domain  reduction  functions. 


4.2  Idempotence 

In  each  of  the  above  papers  the  (often  implicitly)  considered  semantic,  domain 
or  constraint  reduction  functions  are  idempotent,  so  we  now  comment  on  the 
relevance  of  this  assumption. 

To  start  with,  in  our  study  Apt  (1997)  of  linear  constraints  on  finite  integer 
intervals  we  found  that  natural  domain  reduction  functions  are  not  idempotent. 
Secondly, as noticed in Older and Vellino (1993), another paper on constraints
for interval arithmetic on reals, we can always replace each non-idempotent
inflationary function $f$ by

$$f^*(x) := \bigsqcup_{i=1}^{\infty} f^i(x).$$

The  following  is  now  straightforward  to  check. 

Note 17. Consider a $\sqcup$-po $(D, \sqsubseteq)$ and a function $f$ on $D$.

- If $f$ is inflationary, then so is $f^*$.

- If $f$ is monotonic, then so is $f^*$.

- If $f$ is inflationary and $(D, \sqsubseteq)$ has the finite chain property, then $f^*$ is
  idempotent.

- If $f$ is idempotent, then $f^* = f$.

- Suppose that $(D, \sqsubseteq)$ has the finite chain property. Let $F := \{f_1, \ldots, f_k\}$ be
  a set of inflationary, monotonic functions on $D$ and let $F^* := \{f_1^*, \ldots, f_k^*\}$.
  Then the limits of all chaotic iterations of $F$ and of $F^*$ exist and always
  coincide. □

Consequently,  under  the  conditions  of  the  last  item,  every  chaotic  iteration 
of  F*  can  be  modeled  by  a  chaotic  iteration  of  F,  though  not  conversely.  In 
fact,  the  use  of  F*  instead  of  F  can  lead  to  a  more  limited  number  of  chaotic 
iterations.  This  may  mean  that  in  some  specific  algorithms  some  more  efficient 
chaotic  iterations  of  F  cannot  be  realized  when  using  F* . 
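The passage from $f$ to $f^*$ can be made concrete with a small sketch. It assumes a finite domain, so that the finite chain property holds, and an inflationary $f$; under these assumptions the chain $f(x), f^2(x), \ldots$ stabilizes and its stable value is the join defining $f^*(x)$. The toy function `drop_max` (inflationary w.r.t. the inverse subset order, but not idempotent) and the helper `star` are hypothetical names chosen for illustration.

```python
from typing import Callable, FrozenSet

Dom = FrozenSet[int]

def drop_max(s: Dom) -> Dom:
    """Toy reduction: remove the largest element unless only one element is left."""
    return s - {max(s)} if len(s) > 1 else s

def star(f: Callable[[Dom], Dom]) -> Callable[[Dom], Dom]:
    """Return f*, computed by iterating f until the value no longer changes."""
    def f_star(x: Dom) -> Dom:
        y = f(x)
        while f(y) != y:
            y = f(y)
        return y
    return f_star

x = frozenset({1, 2, 3, 4})
drop_max_star = star(drop_max)
```

A single application of `drop_max_star` performs what would otherwise be several separate steps of `drop_max`, which is why an iteration over $F^*$ cannot interleave other functions between those steps.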


4.3  Semi-chaotic  Iterations 

The  results  of  this  paper  can  be  slightly  strengthened  by  considering  the  following 
generalization  of  the  chaotic  iterations. 
Definition 18. Consider a set of functions $F := \{f_1, \ldots, f_k\}$ on a domain $D$.

- We say that an element $i \in [1..k]$ is eventually irrelevant for an iteration
  $d_0, d_1, \ldots$ of $F$ if $\exists m \geq 0\ \forall j \geq m\ f_i(d_j) = d_j$.

- An iteration of $F$ is called semi-chaotic if every $i \in [1..k]$ that appears finitely
  often in its run is eventually irrelevant for this iteration. □

So  every  chaotic  iteration  is  semi-chaotic  but  not  conversely.  Now,  in  all  the 
results  of  this  paper  chaotic  iterations  can  be  replaced  by  semi-chaotic  iterations. 
The reason is that, as shown in the proof of Theorem 10, every semi-chaotic
iteration $\xi$ can be transformed into a chaotic iteration $\xi'$ with the same limit
and such that $\xi$ eventually stabilizes at some $d$ iff $\xi'$ does. The proof of Theorem
10 also shows that every infinite execution of the CIQ algorithm is associated
with a semi-chaotic iteration of $F^+$.

However,  the  property  of  being  a  semi-chaotic  iteration  cannot  be  determined 
from  the  run  only.  So,  for  simplicity,  we  decided  to  limit  our  exposition  to  chaotic 
iterations. 
1.  Introduction

Ever  since  Adleman’s  seminal  paper  [1]  there  has  been  a  flood  of  ideas  on  how  one  could  use 
DNA  to  compute.  Lipton  was  the  first  to  show  that  DNA  could  be  used  to  solve  more  than 
just  a  variation  of  the  famous  travelling  salesman  problem  [12].  Since  then  there  have  been 
many other papers on using DNA to solve various computational problems [3,5,4,6,7,15].

At  the  top  level  all  these  papers  are  similar:  they  all  attempt  to  use  DNA  computation 
to  solve  some  large  search  problem.  Since  a  liter  of  water  can  hold  10^“  bases  of  DNA,  there 
is  the  possibility  that  one  can  outperform  electronic  machines. 

However, this is currently problematic. There are several reasons for this. First, electronic
machines are very fast; moreover, they are getting faster every day. Second, there are
many  models  of  how  to  do  DNA  computations.  Yet,  it  is  unclear  if  any  of  these  models  will 
be  practical.  The  problem  is  mainly  that  DNA  technology  is  not  perfect.  DNA  operations 
are  not  error  free. 

Finally,  there  is  the  lack  of  a  killer  app.  A  killer  app  is  an  application  that  fits  the  DNA 
model;  cannot  be  solved  by  the  current  or  even  future  electronic  machines;  and  is  important. 
The  latter  is  critical:  to  be  a  killer  app  the  problem  must  be  one  for  which  people  are  willing 
to  “pay  money”  for  solutions.  To  date  there  are  no  viable  candidates  for  the  killer  app. 

We propose a new way to use DNA computations. This way allows us to use DNA computations
to solve important and potentially killer applications. The potential applications
include: 

(1) DNA sequencing;

(2) DNA fingerprinting;

(3) DNA mutation detection or population screening;

(4) Other fundamental operations on DNA.

The  key  new  idea  is  to  use  DNA  computation  to  operate  on  unknown  pieces  of  DNA.  This  is 
a fundamental change in the way that we use DNA computation. We call these DNA²DNA
computations:  DNA  to  DNA  computations.  This  idea  was  first  proposed  in  [8]  and  called 
“analog”  DNA  computations  there. 

The key idea is the following. Suppose that one has a test tube that contains multiple
copies of some unknown strand X of DNA. By unknown we mean that we do not know
the sequence of the strand. Suppose further that we wish to compute some property of X,
i.e. for some function f() we wish to obtain the value of f(X). The current way to do this
is: (i) sequence the strand X in the laboratory; (ii) then, determine the value of f(X) on a
PC. The difficulty with this method is that it requires the sequencing of the strand X.

Our new idea is to avoid the expensive step of sequencing the strand X. In particular,
we plan to operate as follows: We will add to the test tube certain known strands of DNA
and use these to perform a DNA computation on X. The result of this computation will be
the answer f(X).

The  advantage  of  this  method  is  that  it  avoids  the  sequencing  step.  Our  hope  is  that 
this  direct  method  of  computing  with  unknown  strands  of  DNA  could  be  the  key  to  finding 
“killer apps”.

There  is  one  huge  advantage  to  our  approach:  since  the  problems  we  are  solving  are 
not digital, there is no way that electronic machines can compete. It’s not that DNA-based
computation is faster, but that there is no way for electronic computers to do the
problems at all. One way to say this dramatically is that there is no place on a PC to
“pour”  in  the  unknown  test  tube  of  DNA.  Without  input,  the  problem  cannot  be  solved  at 
all  on  a  PC. 

Our  method  is  based  on  a  new  transformation  that  allows  us  to  “encode”  an  unknown 
piece  of  DNA.  All  of  the  DNA  computations  to  date  use  special  redundant  codes.  It  is  critical 
that  the  DNA  be  redundantly  encoded.  Without  such  a  coding  the  computations  cannot 
be  performed.  Indeed  the  main  contributions  of  [1,12]  were  the  construction  of  methods  for 
creating  and  managing  such  codes. 

Of course naturally occurring DNA is not coded in this redundant manner. This is a
major  roadblock:  without  codes  the  methods  of  DNA  computation  do  not  apply.  However, 
we  propose  a  method  that  allows  us  to  transform  DNA.  This  transformation  causes  the 
DNA  to  be  re-coded  into  any  redundant  code  that  we  choose. 

There  are  many  advantages  to  this  re-coding.  Mainly,  it  is  now  possible  to  apply  all  of 
the  “tricks”  of  DNA  computation  to  problems  that  involve  unknown  DNA.  Since  the  DNA 
is  coded  the  way  that  we  choose  we  can  operate  on  it  much  more  freely.  For  example,  one 
important  application  of  this  method  is  the  following:  (Note,  the  exact  theorem  statements 
are  in  section  3.) 

Theorem: Suppose that X and Y are unknown strands in distinct test tubes. Then, it
is possible to check whether or not X = Y in $O(\log n)$ bio-steps where both strands are at
most length n.

Note, we mean that we test exactly whether or not X and Y are equal: the method will
discover if they differ in even one base. Further, this is only a simple example of a more
general type of theorem:

Theorem: Suppose that $X^{(1)}, \ldots, X^{(m)}$ are unknown strands of length at most n that
are in distinct test tubes. Then, in $O(\log n)$ bio-steps we can compute the value of
$F(X^{(1)}, \ldots, X^{(m)})$ where $F()$ is an NC$^1$ function.

It is important to point out that our results avoid one of the key difficulties that face
“classic” DNA computations. By “classic” we mean DNA computations that attempt to do
purely digital problems. The advantage is that our results are much more error tolerant.
The reason is that in classic DNA computations there is often “one” strand that the experimenter
seeks to find. In our new type of DNA computations, there are many many copies.
Thus, small error rates or partial rates of completion for some of the operations should not
be a problem.

We  prove  these  results  by  combining  our  re-coding  methods  with  a  generalization  of 
the  pretty  simulation  method  of  Ogihara  and  Ray  [13].  Other  methods  could  be  used  but 
their  method  is  perfect  for  our  needs.  Note,  in  [2]  there  is  a  criticism  of  [13]  for  using 
an unrealistic model. We feel that this criticism is interesting but misses the essential
point. They feel that the cost of the pour operation is not correctly included in [13]. The
answer seems to be two-fold: First, even if the methods are linear in “pour” it’s so fast that
essentially  the  time  is  still  logarithmic.  Second,  one  can  imagine  using  robots  so  that  the 
pours  can  actually  be  done  all  at  once. 


2.  Model 

In this section we introduce our model of DNA computations. It is related to, but fundamentally
different from, the models used in papers on classic DNA computations [3,5,4,6,7,15].
The key point is that in DNA²DNA computations the operations need not work perfectly.
For example, we will only require assumptions about how selective DNA is when single
strands anneal/ligate together. This is a major advantage of DNA²DNA computations. Of
course the hope is that this weakening in the required models will make DNA²DNA computations
really work in the laboratory. (Note, we are just beginning experiments in Laura
Landweber’s laboratory at Princeton University that we hope will show that this is correct.)

All  our  computations  are  described  in  terms  of  operations  that  are  performed  on  test 
tubes.  The  state  of  a  test  tube  is,  thus,  a  critical  concept.  At  any  time  a  test  tube  will  contain 
a  multi-set  of  different  pieces  of  DNA.  Some  pieces  will  be  single  strands,  some  double  strands 
and  others  more  complex  structures.  Clearly,  in  order  to  describe  mathematically  such  a 
state,  we  need  to  supply  the  following  information: 

(1)  The  types  of  pieces  of  DNA  that  are  in  the  test  tube; 

(2)  The  total  number  of  pieces  that  are  in  the  test  tube; 

(3)  The  number  of  pieces  of  each  type  that  are  in  the  test  tube. 

We will use string terminology to describe single strands of DNA. More precisely, we will
identify strings $S$ over the alphabet {A, T, C, G} with the single strand of DNA of the form:

5'-$S_1, \ldots, S_n$-3'

Also by the Watson-Crick complement of $S$ we will mean the string that is the reverse of $S$
with each element changed into its complement, i.e. “A” with “T” and “C” with “G”.
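The definition of the Watson-Crick complement translates directly into code. The following is a minimal sketch that assumes strands are written as Python strings over the alphabet {A, T, C, G} in the 5' to 3' direction; the function name is illustrative.

```python
# Base pairing used in the definition: A with T, C with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def watson_crick_complement(s: str) -> str:
    """Reverse the string and replace every base by its complement."""
    return "".join(COMPLEMENT[base] for base in reversed(s))
```

For instance, `watson_crick_complement("AACG")` reverses the string to "GCAA" and then complements each base, giving "CGTT"; applying the function twice returns the original string.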

Suppose  that  a  test  tube  T  only  contains  single  strands  of  DNA:  note,  this  is  an  important 
special  case.  Clearly,  its  mathematical  definition  requires  that  we  supply  the  following: 

(1) A collection of strings $S^{(1)}, \ldots, S^{(k)}$ that correspond to the single strands in T;

(2) An integer M that is the total number of strands in T;

(3) A collection of frequencies $q_1, \ldots, q_k$ so that the strand $S^{(i)}$ occurs $q_i M$ times,
where $q_1 + \cdots + q_k = 1$.

One of the key insights about DNA²DNA computation is that we can simplify this definition:
we  do  not  need  to  supply  M.  That  is  we  need  not  worry  about  the  exact  number  of  strands 
that  are  in  the  test  tube.  We  need  only  to  keep  track  of  the  frequency  of  each  strand. 

This  is  an  important  point  about  the  difference  between  some  classic  DNA  computations 
and DNA²DNA computations. In classic computations the number of types of strands k is
the  same  order  of  magnitude  as  the  total  number  of  strands  M.  This  is  because  in  classic 
computations  each  strand  is  performing  a  separate  computation:  we  need  to  have  both  k 
and  M  as  large  as  possible. 

On the other hand, in DNA²DNA computations k will often be relatively small. For
example, k = 1,000 and M = 10^^ are quite reasonable parameters. Since M/k is so large we
can  essentially  ignore  the  exact  value  of  M.  Of  course  it  is  critical  for  all  DNA  computations 
that  there  be  enough  material  available  to  make  the  operations  feasible.  Note,  if  in  some 
situation  M  became  too  small,  then  a  standard  “trick”  is  to  use  PCR  to  increase  the  number 
of  total  strands  and  thus  restore  M  to  a  large  enough  value. 

In  summary,  for  the  rest  of  the  paper  we  will  only  supply  the  frequencies  of  each  piece  not 
the  total  number  of  pieces  in  describing  a  test  tube.  A  common  situation  is  the  following: 
Say that a test tube T contains the single strands $S^{(1)}, \ldots, S^{(k)}$ in equal amounts provided T contains the
same number of copies of each of the given single strands of DNA.

Now let us turn to consider the class of operations that we require: (Each is a bio-step
in our computations.)

(1) Cut. This operation cuts or cleaves double strands of DNA at a certain pattern.
This is done by using a restriction enzyme.

(2) Gel Separate. This operation uses denaturing polyacrylamide gel electrophoresis
to separate DNA molecules by length.

(3) Anneal. This operation allows single strands to form double strands based on
Watson-Crick pairing, i.e. “A” with “T” and “C” with “G”.

As  stated  earlier  we  do  not  assume  that  each  operation  works  perfectly.  Let  us  now  discuss 
the  exact  error  model  that  we  assume.  Let  r  >  0  be  a  fixed  small  constant;  we  expect  that 
it  will  be  smaller  than  10'^.  We  will  use  r  to  bound  the  error  rate  of  all  the  operations 
that  we  perform.  Note,  we  really  have  a  collection  of  r’s:  one  for  each  operation.  However, 
to  avoid  statements  that  are  overly  complex  we  will  lump  all  the  error  rates  together.  Of 
course,  one  can  in  principle  unravel  this  and  get  the  exact  dependence  on  each  error  rate,  if 
one  needs  finer  resolution. 

Now  let  us  turn  to  discuss  the  error  rates  of  each  type  of  operation.  A  cut  can  fail  in 
two  basic  ways.  First,  a  pattern  that  should  be  cut  may  not  be  cut.  Second,  some  place 
that  does  not  match  the  pattern  may  be  incorrectly  cut.  We  assume  that  at  least  1/2  of 
the  correct  sites  are  cut;  we  assume  that  at  most  r  of  the  incorrect  ones  are  cut.  Note, 
the action of most restriction enzymes is usually stated in terms of how long they take to
cut  1/2  of  the  population.  One  can  increase  this  amount  by  either  adding  more  enzyme  or 
increasing  the  time  of  incubation. 

Next  let  us  discuss  the  separation  of  DNA  by  length.  As  in  other  papers  we  will  arrange 
things  so  that  no  separation  is  ever  required  to  separate  strands  that  are  too  close  in  length. 
Further,  we  will  arrange  it  so  that  the  lengths  are  quite  short.  Gel  methods  work  best  for 
very short lengths. For lengths below several hundred one can tell length i from length i+1. We will
assume  that  at  least  1/2  of  the  strands  of  the  given  length  are  correctly  extracted;  we  also 
assume  that  at  most  r  strands  of  the  wrong  length  are  also  extracted.  Note,  this  means  that 
we  do  not  assume  that  strands  are  not  lost  in  performing  the  gel.  As  long  as  approximately 
1/2  of  the  correct  strands  are  not  lost  the  operation  fits  our  model. 

Finally,  we  must  discuss  the  error  model  used  for  annealing.  This  is  the  most  complex. 
There are two cases. The first case is the “far-apart” case. In this case the single strands either
exactly Watson-Crick bond or are such that they agree in at most 1/4 of their positions.
Furthermore  we  assume  that  the  length  of  the  match  is  above  a  fixed  threshold.  In  this  case 
we  assume  that  at  least  1/2  of  the  correct  pairs  form;  we  assume  that  effectively  none  of  the 
incorrect ones form. Note that we are implicitly assuming that there are enough of the
DNA  strands  for  these  reactions  to  actually  take  place.  However,  we  have  already  stated 
that  there  will  always  be  “enough”  material. 

The  second  case  is  the  “near”  case.  In  this  case,  the  correct  and  incorrect  strands  agree 
in  more  than  1/4  of  their  positions.  Now  we  can  no  longer  assume  that  incorrect  pairs  will 
not form. For example, if two strands $\alpha$ and $\beta$ are Watson-Crick complements except for
one  position,  then  they  will  likely  bind  each  other.  This  is  even  more  likely  if  the  one  place 
they  differ  is  at  the  end.  In  this  case  they  will  bind  almost  as  well  as  a  perfectly  matched 
pair.  Thus,  in  this  case  we  cannot  assume  that  the  rates  of  formation  are  vastly  different 
for  the  correct  and  the  incorrect  case. 

In  this  case  we  make  the  following  weak  assumption.  We  assume  only  that  the  rate 
or  probability  for  a  perfect  match  is  strictly  bigger  than  that  of  a  partial  match.  This  is 
thermodynamically reasonable: more matches will be better. We make no assumption about
the exact difference. However, it is important to make a small assumption that the gap is
at least $\delta > 0$ for a fixed small value of $\delta$.

In  summary,  the  error  model  is  as  follows: 
Operation             Correct   Incorrect
cut                   1/2       r
gel separate          1/2       r
anneal (far-apart)    1/2       0
anneal (near)         $p_1$     $p_2$

where all that is claimed is that in each case, $p_1$ is strictly larger than $p_2$ by an amount that
is at least $\delta$. This is analogous to a selection coefficient. The big surprise, perhaps, is that
we can assume so little about annealing accuracy in the near case. It is unclear that such a
weak assumption is enough to get any results. However, it turns out that it is enough: how
is the subject of the next section.


3.  Re-Coding of Unknown DNA

In  this  section  we  will  show  how  to  re-code  an  unknown  strand  of  DNA  X  by  one  that  is 
coded as we wish. Suppose that we have a test tube that contains X. We must show how to
create  a  new  test  tube  that  contains  a  re-coded  version  of  the  strand  X.  We  will  do  this  in 
two  stages.  Since  our  operations  are  only  approximate  we  will  not  be  able  to  do  this  exactly. 
Rather,  we  will  be  able  to  construct  a  test  tube  that  “approximates”  the  desired  one. 

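Definition: Let test tube T contain the single strands $S^{(1)}, \ldots, S^{(k)}$ with frequencies
$q_1, \ldots, q_k$ and let T' contain $S^{(1)}, \ldots, S^{(k)}$ with frequencies $q'_1, \ldots, q'_k$. Then, say that T
$\epsilon$-approximates T' provided $|q_i - q'_i| < \epsilon$ for each $i$.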
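Since a test tube is described only by the frequencies of its strands, the approximation relation is easy to state in code. The sketch below is an illustration under assumptions: a tube is modelled as a dictionary from strand strings to frequencies, and a strand missing from one tube is treated as having frequency 0 there (the definition itself assumes both tubes hold the same strands). The names `frequencies` and `approximates` are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable

def frequencies(strands: Iterable[str]) -> Dict[str, float]:
    """Turn a multiset of strands into the frequency description of a test tube."""
    counts = Counter(strands)
    total = sum(counts.values())
    return {s: c / total for s, c in counts.items()}

def approximates(q: Dict[str, float], q_prime: Dict[str, float], eps: float) -> bool:
    """True if every strand frequency differs by less than eps between the two tubes."""
    strands = set(q) | set(q_prime)
    return all(abs(q.get(s, 0.0) - q_prime.get(s, 0.0)) < eps for s in strands)

ideal = frequencies(["AC", "AC", "GT", "GT"])   # two strands in equal amounts
actual = {"AC": 0.52, "GT": 0.48}               # a slightly skewed tube
```

Note that the total number of strands M never enters the comparison, in line with the decision to describe tubes by frequencies alone.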

Next  a  string  definition: 

Definition: A string $\alpha$ is in the string $S$ provided $\alpha$ occurs as a consecutive substring of
$S$, i.e. that for some $i$,

$$\alpha = S_i, \ldots, S_{i+l}$$

where $l$ is the length of the string $\alpha$. A string $\alpha$ is in the strand X provided $\alpha$ is in the string corresponding to X.

There  are  two  “tricks”  that  allow  us  to  improve  the  quality  of  our  basic  operations.  The 
first  is  that  we  can  repeat  a  length  separation  multiple  times.  Clearly,  less  “correct”  DNA 
is  selected  but  also  less  “incorrect”  DNA  gets  by.  For  example,  the  following  is  useful: 

Lemma 1: Suppose that test tube T contains equal amounts of $S^{(1)}, \ldots, S^{(k)}$ where each
string is a different length. Then, for any $\epsilon > 0$ and each $i$ we can in $O(\log(1/\epsilon))$ bio-steps
construct a test tube that is an $\epsilon$-approximation to the test tube that only contains $S^{(i)}$.
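The effect of repeating a length separation can be estimated with a short calculation. The sketch below works under simplifying assumptions that are not stated in this form in the text: each separation keeps exactly 1/2 of the strands of the wanted length and a fraction r < 1/2 of the strands of each unwanted length, the tube starts with k strand types in equal amounts, and repetitions are independent. The function names are hypothetical.

```python
def contamination(k: int, r: float, m: int) -> float:
    """Fraction of wrong-length strands left after m repeated separations."""
    correct = 0.5 ** m              # surviving share of the wanted strand
    wrong = (k - 1) * r ** m        # surviving share of the k - 1 unwanted strands
    return wrong / (correct + wrong)

def repeats_needed(k: int, r: float, eps: float) -> int:
    """Smallest number of repetitions that pushes the contamination below eps."""
    if not 0 <= r < 0.5:
        raise ValueError("the sketch assumes an error rate r with 0 <= r < 1/2")
    m = 1
    while contamination(k, r, m) >= eps:
        m += 1
    return m
```

Under these assumptions the contamination is at most (k - 1)(2r)^m, so it shrinks geometrically with the number of repetitions, while the amount of correct material only halves each time.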

We  do  not  know  in  general  how  to  get  a  similar  lemma  for  cuts.  Repeating  a  cut,  for 
example,  will  cause  more  correct  material  to  be  cut  but  will  also  cut  more  incorrect  material. 
However,  there  is  a  very  important  case  where  we  can  essentially  do  this.  Suppose  that  we 
have  a  test  tube  T  and  we  plan  to  first  apply  a  cut  step  and  then  a  separation  step.  If  all 
the  pieces  from  the  cut  have  the  same  length  then  we  can  apply  Lemma  1  after  the  cut. 
The  effect  of  this  will  be  that  incorrectly  cut  material  will  be  filtered  out.  In  a  sense  we 
have made the cut appear to have an error rate of $\epsilon$ rather than r.
Theorem 1: Suppose that a test tube contains the unknown strand X. Also let $l = O(\log n)$
and let $f()$ be a function that is defined on length $l$ strings. Then, for any $\epsilon > 0$ in
$O(\log n + \log(1/\epsilon))$ bio-steps we can create a test tube T' that is an $\epsilon$-approximation to
the test tube that contains in equal amounts the strands that correspond to $f(\alpha)$ where $\alpha$
ranges over the length $l$ consecutive substrings of X.
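The target test tube of Theorem 1 can be described in silico when X is known, which may help to fix what the laboratory procedure is meant to approximate. The sketch below is purely illustrative: the point of the theorem is that the real procedure never reads X. The re-coding function `double_each_base` is a made-up example of a redundant code, and `recoded_tube` is a hypothetical helper.

```python
from typing import Callable, Dict

def recoded_tube(x: str, l: int, f: Callable[[str], str]) -> Dict[str, float]:
    """Frequencies of the ideal tube: f(a) for every length-l substring a of x, in equal amounts."""
    pieces = {f(x[i:i + l]) for i in range(len(x) - l + 1)}
    return {p: 1 / len(pieces) for p in sorted(pieces)}

def double_each_base(a: str) -> str:
    """Toy redundant code (an assumption): write every base twice."""
    return "".join(base * 2 for base in a)

tube = recoded_tube("ACGA", 2, double_each_base)
```

Here the substrings AC, CG and GA are re-coded to AACC, CCGG and GGAA, each with frequency 1/3.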

Thus, we can go from a test tube that contains one string to another that contains re-coded
versions of all the consecutive pieces of X. This is not enough but is an important
first step. We describe the complete method elsewhere [8]. (Here we will only sketch the
proofs of the theorems. The full proofs will be in the final paper.)

Proof of Theorem 1: Recall that T is a test tube that contains many copies of the strand
X. Our plan is to add to this test tube additional pieces of DNA. These will be from a
set we call the probe set. In particular, assume that we have already created the following
probe set in another test tube. The probe set will contain for each string $\alpha$ of length $l$, the
strand that corresponds to the Watson-Crick complement of $\alpha f(\alpha)$. We then will anchor
the strands of the test tube T to a surface of another test tube T'. Then, add the probe set
and allow the probes to anneal. Now, wash off the excess. Next elute the bound probes from the
solid support. Then, allow them to re-attach. Repeat these steps, i.e., perform a molecular
selection procedure and call this collection of probes T'.

The result is that T' will contain those probes that survived the repeated washing steps.
We claim that these will be almost totally the correct ones, i.e. a probe αf(α) that survives
will, with high probability, have α in X.

Let us calculate the survival probability in the correct and the incorrect case. In the
correct case the probability that a probe survives is p1^m, where m is the number of iterated
cycles of selection by binding; in the incorrect case it is p2^m. Here p1 > p2 + δ.

Note, this assumes that only the α part of each probe is available for bonding. We can
easily arrange this in a number of ways. The simplest is to add additional material that
blocks the f(α) part of the probes. We assume that this is done. See [8] for details.

Since p2 is bounded below p1, for m = O(log(n) + log(1/ε)) we expect that the fraction
of incorrect probes that survive is at most ε.
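
To see where a logarithmic number of rounds comes from, the following minimal sketch solves for m. It rests on a simplified model that is our assumption, not the paper's: correct probes survive m rounds with probability p1^m, incorrect ones with probability p2^m, and we ask that the expected number of surviving incorrect probes be at most ε times the expected number of surviving correct probes. The function name and all parameters are hypothetical.

```python
import math


def selection_rounds(n_correct, n_incorrect, p1, p2, eps):
    """Smallest m with n_incorrect * p2**m <= eps * n_correct * p1**m.

    Assumed model (not from the paper): independent survival per round,
    probability p1 for a correct probe and p2 < p1 for an incorrect one.
    """
    if not (0 < p2 < p1 <= 1) or eps <= 0:
        raise ValueError("need 0 < p2 < p1 <= 1 and eps > 0")
    ratio = n_incorrect / (eps * n_correct)
    if ratio <= 1:
        return 0  # already within the target before any selection round
    # (p2 / p1)**m <= 1 / ratio  <=>  m >= log(ratio) / log(p1 / p2)
    return math.ceil(math.log(ratio) / math.log(p1 / p2))
```

With a probe set of size polynomial in n (for example 4^l with l = O(log n)), log(ratio) is O(log(n) + log(1/ε)) and log(p1/p2) is a positive constant, which matches the bound stated above.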

Finally, we can arrange the probes so that we can cut away the f(α) part. Then, provided
we have arranged that the length of f(α) is much larger than l, a separation yields the
desired test tube of DNA. In order to keep the error rate low we use Lemma 1 to repeat the
separation. ∎

We  plan  now  to  use  Theorem  1  to  allow  us  to  re-code  the  whole  of  the  unknown  X. 
Note,  however,  that  already  the  transformation  is  quite  useful.  For  example,  in  [8]  we  show 
how  it  can  potentially  be  used  to  increase  the  power  of  “DNA  chips”.  These  chips  attempt 
to  sequence  unknown  DNA  via  hybridization. 

Our plan is to use Theorem 1 to build a special test tube of DNA. It will contain pieces
for each length-l consecutive substring of X. Moreover, these pieces will be able to anneal together
to form the encoding of X. The key is that this method will only work on “reasonable”
X's. The problem is that Theorem 1 only allows us to work with short parts of X. Thus
we need X to have the property that it is determined by its short pieces. If X is not, then
this approach cannot succeed.

Definition: Say that a string X is l-determined provided that X is uniquely reconstructible
from its length-l subsequences.

Note, if X is random then certainly for l about 2 log(n) all the length-l pieces are likely to be
unique; in this case X is trivially l-determined. However, subsequences can be repeated and
X can still be l-determined. This notion is already in use in DNA chips [16]. The method of
sequencing via hybridization can only work for sequences that are l-determined for a small
value of l. Also, we do not require that it is easy to find X from its pieces, only that it is
possible.
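
For intuition, the definition can be tested by brute force on very short strings. The sketch below is illustrative only and makes two assumptions that the text does not fix: the pieces are the consecutive length-l substrings, and they are given as a multiset (with multiplicities). It enumerates every string that uses exactly those pieces, so its running time is exponential in general; the function names are hypothetical.

```python
from collections import Counter


def reconstructions(x, l):
    """All strings whose multiset of consecutive length-l substrings equals that of x."""
    pieces = Counter(x[i:i + l] for i in range(len(x) - l + 1))
    total = sum(pieces.values())
    found = set()

    def extend(current, used):
        if used == total:
            found.add(current)
            return
        # The next piece must overlap the current string in l - 1 symbols.
        suffix = current[-(l - 1):] if l > 1 else ""
        for piece in list(pieces):
            if pieces[piece] > 0 and piece.startswith(suffix):
                pieces[piece] -= 1
                extend(current + piece[-1], used + 1)
                pieces[piece] += 1

    for start in list(pieces):
        pieces[start] -= 1
        extend(start, 1)
        pieces[start] += 1
    return found


def is_l_determined(x, l):
    """True if x is the only string consistent with its length-l pieces."""
    return reconstructions(x, l) == {x}
```

For example, under these assumptions "ACAGA" is not 2-determined (the same pieces also spell "AGACA"), but it is 3-determined.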
Let us fix two functions h_L(β, i) and h_R(β, i), where β is a string of length l − 1 and i
ranges from 1 to n. These functions are the “re-coding” or “hashing” functions that we will use.
We assume that they hash values so that distinct values in their range agree in at most 1/4
of their positions. Thus, one can think of them as assigning a hash to a length l − 1 string
and an index i.

Theorem 2: Suppose that T is a test tube with unknown DNA X that is l-determined for
l = O(log(n)), where n is its length. Suppose also that h_L() and h_R() are hash functions
as above. Then, for any ε > 0, in O(log(n) + log(1/ε)) bio-steps we can form the test tube T'
that is an ε-approximation to the test tube that contains in equal amounts the strands of the
form:


where β[i] is the substring of X of length l − 1 starting at index i.

Proof of Theorem 2: Again we plan only to sketch the proof. The basic idea is as follows.
Suppose that β[i] is the l − 1 long substring of X starting at index i. Then, we plan to put
into a test tube the following pieces of DNA for each index i in the range 1 to n:

(1) if i is odd, h_L(β[i], i) h_R(β[i + 1], i + 1);

(2) if i is even, the Watson-Crick complement of h_L(β[i], i) h_R(β[i + 1], i + 1).

We  do  this  by  appealing  to  Theorem  1. 

If Theorem 1 were perfect, then because we are in the far-apart case only the correct
strands would form. The key is that as long as X is l-determined one can prove that no
other strand will form that is of the correct length. Then, we could finish up the proof by
using Lemma 1 to perform a length separation.

However, Theorem 1 only creates an approximation to the test tube that contains the pieces
according to (1) and (2). Thus, we need to take into account the fact that a fraction ε of the
test tube is incorrect.

Consider how the pieces anneal and ligate together to form the one correct long strand
of the correct length. If there are errors, sometimes incorrect pieces will assemble. Call these
“miracle steps”. The point is that these occur, but their frequency is at most ε. Thus, the
expected fraction of ways to assemble 2 correct pieces together with a miracle step is at
most $\binom{n}{1}\varepsilon$. In general, with j miracle steps it is $\binom{n}{j}\varepsilon^{j}$. An easy calculation then shows that
the fraction of incorrect strands allowing miracle steps is at most O(nε). Thus, for ε small
enough this will prove the theorem. ∎
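
The “easy calculation” can be sketched as follows. This is a reconstruction under two stated assumptions, not the authors' own derivation: strands with j miracle steps contribute at most a binomial term of the form $\binom{n}{j}\varepsilon^{j}$, and ε is small enough that nε ≤ 1. The constant 2 is just one convenient choice.

```latex
\sum_{j=1}^{n} \binom{n}{j}\,\varepsilon^{j}
  \;=\; (1+\varepsilon)^{n} - 1
  \;\le\; e^{n\varepsilon} - 1
  \;\le\; 2\,n\varepsilon
  \qquad \text{whenever } 0 \le n\varepsilon \le 1 .
```

The first step is the binomial theorem, the second uses 1 + ε ≤ e^ε, and the last holds because e^y − 1 is convex and lies below the line 2y at both y = 0 and y = 1.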

Essentially,  the  proof  of  this  theorem  uses  the  same  method  Adleman  used  in  his  original 
paper  [1].  The  main  difference  is  that  we  do  not  place  the  pieces  explicitly  into  the  test 
tube.  Rather  they  are  generated  by  the  action  of  the  molecular  selection  step  of  Theorem 
1.  Another  key  difference  is  that  we  allow  errors. 


4.  Applications 

Once  DNA  is  re-coded  the  full  power  of  DNA  computation  can  be  used  to  solve  many 
interesting  problems.  In  particular,  we  can  now  show  that  any  reasonable  computation  can 
be  efficiently  performed  on  unknown  DNA. 

Theorem 3: Suppose that X^(1), ..., X^(k) are unknown strands of length at most n that
are in distinct test tubes. Moreover, assume that they are O(log(n))-determined. Then, in
O(log(n)) bio-steps we can compute the value of F(X^(1), ..., X^(k)) where F() is an
function. 
Proof of Theorem 3: We will only sketch the main ideas. Our plan is to use Theorem
2 to construct a test tube that contains the information required by [13] to compute F.
Think of the input as kn bits and select a coding method as required by [13]. Then, by
Theorem 2 we can replace each of the test tubes by an approximation to one that contains
the re-coded X^(i) as one long strand. Then, we cut this into n pieces: one for each of its bits. We pour
equal amounts of these k test tubes together. Then, we perform the operations as in [13] to
compute the value of F. A key point is that while we only have an approximation to the
test tube, [13] is sufficiently robust that it will still compute correctly. ∎

Note, we needed a new operation here: the ability to take test tubes T_1, ..., T_k and
create a new test tube T that contains equal amounts of material from each test tube. We
claim that this is a reasonable operation that can be done again with error rate at most r.

Corollary 4: Suppose that X and Y are unknown strands in distinct test tubes. Moreover,
assume that they are O(log(n))-determined. Then, it is possible to check whether or not
X = Y in O(log(n)) bio-steps, where both are of length at most n.


5.  Conclusions 

Before discussing whether or not these results are practical, there is a generalization that
should be mentioned. The main one is the case of partially unknown DNA. In many
interesting situations the DNA in a test tube is not unknown. Rather, we know that it is
equal to a known X_0 except for perhaps a few bases. This occurs in the case of mutation
detection, for example. In this case the same theorems of Section 4 can apply. However,
now the probe set can be dramatically reduced in size. The full details of this will be in the
final paper.

There are a number of issues that must be solved before we can claim that these methods
are practical. We view them as the start of a new direction for DNA computation. We
believe that they should be viewed as an “existence” proof. That is, our results are not
going directly into the laboratory. However, the idea of re-coding unknown DNA and then
directly computing with it, DNA²DNA computations, is potentially important.

Some  of  the  practical  issues  are  the  following: 

(1) How can we build the probe sets?

(2) Can we weaken the assumption on annealing in the far-apart case?

(3) Can we weaken the assumption that the DNA is l-determined?

Clearly,  one  cannot  use  DNA  synthesis  machines  directly  to  build  large  probe  sets.  At  least 
two  interesting  methods  seem  possible.  For  one,  we  may  be  able  to  use  the  same  technology 
that  is  used  to  create  DNA  chips.  The  micro-robotic  methods  used  to  create  these  chips 
might  be  useful  for  generating  our  probe  sets.  For  another,  we  may  be  able  to  exploit 
the  structure  of  the  probe  sets.  The  probe  sets  we  need  are  very  regular  sets.  Indeed  the 
following seems to be an important open problem: Given a set of strings, what is the cost in
bio-steps to create a test tube that contains only those strings?

Second, an important question is how far can we weaken the assumption of how
far-apart strands anneal and ligate? One of the most important questions for much of DNA
computation  is  to  better  model  annealing  and  ligation.  Obviously,  the  more  realistic  we  are 
in  modelling  how  strands  mis-pair,  the  more  practical  our  results  will  be.  In  particular,  can 
we  prove  Theorem  2  in  the  case  where  incorrect  annealing/ligations  occur  but  with  a  low 
probability? 

Finally, there is the problem of the assumption we made that the DNA is l-determined.
As stated earlier, random or even approximately random strings are l = O(log(n))-determined.
However,  there  are  two  problems  with  this.  First,  the  size  of  /  may  be  logarithmic  but  too 
large  for  practice.  Second,  real  DNA  is  not  random.  Can  the  methods  of  Theorem  2  be 
improved  to  handle  real  DNA?  We  are  currently  investigating  these  questions. 
Tilings and Quasiperiodicity

Abstract. Quasiperiodic tilings are those tilings in which finite patterns
appear regularly in the plane. This property is a generalization of
periodicity; it was introduced for representing quasicrystals and it is also
motivated by the study of quasiperiodic words. We prove that if a tile
set can tile the plane, then it can tile the plane quasiperiodically, a
surprising result that does not hold for periodicity. In order to compare the
regularity of quasiperiodic tilings, we introduce and study a
quasiperiodicity function and prove that it is bounded by x ↦ x + c if and only
if the considered tiling is periodic. Finally, we prove that if a tile set can
be used to form a quasiperiodic tiling which is not periodic, then it can
form an uncountable number of tilings.


1  Introduction 

Matching  rules  in  tilings  are  local  constraints.  Thus,  tile  sets  have  been  used 
to  model  atomic  positions  in  materials  defined  by  short-range  interactions.  A 
traditional approach is then to focus on the periodicity or quasiperiodicity
properties of tilings that can be formed. This study has been revived by quasicrystals
(see [7] for an overview of the subject and pertinent references such as [9]). A
relation  between  the  quasiperiodicity  property  and  the  notion  of  self-similarity 
is  established  in  [5]. 

On the other hand, tilings can be considered as 2-dimensional infinite words
with a local constraint. For 1-dimensional structures, an overview of results
concerning infinite words can be found in [12]; bi-infinite words are studied
in [10, 11], and the problem of quasiperiodicity is strongly related to the study
of Sturmian words (see for instance [14] and references therein).

We  present  in  this  paper  three  main  results.  First,  a  tile  set  that  can  tile 
the  plane  can  always  be  used  to  form  a  quasiperiodic  tiling  of  the  plane.  It  is 
surprising  because  the  same  property  for  periodic  tilings  was  conjectured  by 
Wang  in  1961  (see  [15])  and  was  proved  false  by  his  student  Berger  in  1966 
(see [1]). Furthermore, it has been proved that there exists a tile set that can tile
the plane but all of whose possible tilings are non-recursive.

To  prove  this  first  result  (in  Section  3),  we  introduce  a  preorder  between 
tilings of the plane that we call an extraction preorder. We show that
quasiperiodic tilings are exactly the minimal elements of this preorder.

We also introduce a function to measure the regularity of a quasiperiodic
tiling (Section 4). We prove that a quasiperiodic tiling is periodic if and only if
this function is of the form x ↦ x + c. We present some open problems in this
field.
Our third main result (in Section 5) is that if a tile set can be used to form a
strictly quasiperiodic tiling of the plane (i.e. non-periodic), then it can be used
to form an uncountable number of different tilings. A corollary of this result
is that if a tile set is aperiodic (i.e. cannot be used to form a periodic tiling)
then it can form an uncountable number of different tilings. To prove this we
are inspired by Dolbilin in [4] to introduce a tree representation for tile sets.

In our last section, we present a topological approach to tilings. This
approach allows us to give another point of view on our results of Section 3. We
have not yet proved any new result using topology but we think that this
approach may be fruitful.

Due to the page limit, some proofs are omitted.


2  Preliminaries 

A tile is a square with colored sides. Colors belong to a finite set C. A tile set
T is a subset of C^4. All tiles have the same (unit) size. A configuration is a
mapping from the plane into the tile set. We call pattern a partial function
of finite domain from Z² into the tile set. We say that a pattern appears in a
configuration if the configuration is an extension of the image of this pattern
by a shift. A tiling of the plane is a configuration in which all pairs of adjacent
sides have the same color. Notice that it is not allowed to turn tiles.

The  tiling  problem  consists  of  a  tile  set  as  input,  and  the  question  is  whether 
it  can  be  used  to  tile  the  plane.  It  was  formulated  by  Wang  in  1961  [15]  for 
some  logical  purposes:  a  tile  set  can  be  reduced  into  some  formula  such  that 
the  formula  is  satisfiable  if  and  only  if  the  tile  set  can  tile  the  plane.  This  tiling 
problem  was  conjectured  decidable  but  was  proved  undecidable  by  Berger  [1]  in 
1966; a simplified proof was given in 1971 by Robinson [13] (see also [2] for the
consequences in logic, namely Hilbert’s well-known Entscheidungsproblem).

A  periodic  configuration  is  formed  by  the  juxtaposition  of  copies  of  the  same 
rectangle.  In  other  terms  a  periodic  configuration  should  be  periodic  with  respect 
to  both  axes.  Thus,  a  periodic  tiling  is  a  periodic  configuration  which  is  also  a 
tiling.  This  definition  is  justified  by  the  following  result  of  Wang:  if  a  tile  set  can 
form  a  tiling  which  is  periodic  in  only  one  direction,  then  it  can  form  a  tiling 
which  is  periodic.  This  property  was  one  of  the  reasons  why  Wang  conjectured 
that  the  tiling  problem  was  decidable.  The  other  reason  was  that  he  did  not 
know  any  aperiodic  tile  sets,  i.e.  tile  sets  that  can  tile  the  plane  but  cannot  form 
any  periodic  tiling.  If  such  aperiodic  tile  sets  did  not  exist,  then  one  could  decide 
the tiling problem by the following algorithm: try to tile a square of size n; if you
cannot,  then  halt  and  answer  “no”,  else  if  you  can  tile  the  square  periodically, 
then  halt  and  answer  “yes”,  else  add  1  to  n  and  restart  the  same  process.  This 
algorithm  does  not  halt  if  and  only  if  the  considered  tile  set  is  aperiodic.  In  the 
proof  of  Berger’s  theorem  an  aperiodic  tile  set  is  constructed  with  more  than 
20000  tiles,  and  in  Robinson’s  simplified  proof  an  aperiodic  tile  set  containing 
approximately 50 tiles is constructed. The smallest known aperiodic tile set
contains  13  tiles  and  is  due  to  Culik  and  Kari  ([3,  8]). 
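
The procedure described above can be sketched as follows. This is a minimal illustration, not code from the paper: tiles are represented as tuples (north, east, south, west), and “tiling the square periodically” is interpreted (our assumption) as finding an n × n block whose opposite borders also match, so that repeating the block tiles the plane. The function names and the `max_n` cut-off are hypothetical; without a cut-off the loop would run forever on an aperiodic tile set.

```python
def find_square_tiling(tiles, n, wrap=False):
    """Return an n-by-n grid of tiles with matching adjacent sides, or None.

    Each tile is a tuple (north, east, south, west); tiles are not rotated.
    With wrap=True the block must also match across its own borders.
    """
    grid = [[None] * n for _ in range(n)]

    def fits(tile, r, c):
        north, east, south, west = tile
        if r > 0 and grid[r - 1][c][2] != north:
            return False
        if c > 0 and grid[r][c - 1][1] != west:
            return False
        if wrap and r == n - 1:
            top = tile if n == 1 else grid[0][c]
            if top[0] != south:
                return False
        if wrap and c == n - 1:
            first = tile if n == 1 else grid[r][0]
            if first[3] != east:
                return False
        return True

    def place(k):
        # Fill cells in row-major order with simple backtracking.
        if k == n * n:
            return True
        r, c = divmod(k, n)
        for tile in tiles:
            if fits(tile, r, c):
                grid[r][c] = tile
                if place(k + 1):
                    return True
        grid[r][c] = None
        return False

    return grid if place(0) else None


def semi_decide(tiles, max_n):
    """Return "no", "yes", or None if undecided after squares up to max_n."""
    for n in range(1, max_n + 1):
        if find_square_tiling(tiles, n) is None:
            return "no"   # some square cannot be tiled, hence neither can the plane
        if find_square_tiling(tiles, n, wrap=True) is not None:
            return "yes"  # a periodic tiling exists
    return None
```

The search is exponential in n², so this is only usable on toy tile sets; its purpose is to make the three possible outcomes of each round explicit.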
The periodic tiling problem consists of a tile set as input, and the question
is whether it can be used to tile the plane periodically. It has been proved
undecidable by Gurevich and his student Koriakov in 1972 [6]. They furthermore
proved that you cannot recursively separate tile sets that cannot tile the plane
from tile sets that can tile the plane periodically. This result is very important
because in the previously mentioned reduction, tile sets that can tile the plane
periodically correspond to formulas having a finite model. Such reductions are
called conservative in [2]. This book contains an appendix by ourself devoted to
the proof of all these undecidability results.

It is often convenient to use other notions of tile sets that differ slightly from
the one above:


- one can use arrows on tiles; a tiling is considered valid if and only if all
pairs of adjacent sides have the same color, and if, for each arrow of the
plane, its head points to the tail of an arrow in the adjacent cell;

- one can replace squares by polygons of the plane and ask that two adjacent
polygons neither overlap nor create holes;

- one can put a color not only on the sides of the squares but also on their
corners; four corners in contact should have the same color;

- one could just assign a state (out of a finite set) to each considered cell and
fix a neighborhood. The matching condition is replaced by a relation between
states that should be verified in the neighborhood of each cell.

It  is  folklore  that  all  these  notions  are  equivalent:  there  exist  transformations 
of  tile  set  from  one  notion  into  another  that  preserve  existence  of  valid  tilings, 
periodicity  or  non-periodicity,  quasiperiodicity,  etc. 

We could have considered tilings of the continuous plane by polygons such
as in the well-known Penrose tilings. This notion of tilings is not equivalent to
Wang tiles because the centers of the considered polygons may not have rational
coordinates. Anyway, for these tilings, our Theorems 6 and 13 are still valid if
one considers that two tilings (or patterns) are equal if they can be
superimposed using translations and rotations. The needed changes in the proofs are
straightforward. Our study of the regularity of quasiperiodic tilings (Section 4)
is slightly changed in this case: the size of a pattern (and thus quasiperiodicity
functions) should be defined up to a multiplicative constant.


3  Extraction  and  quasiperiodicity 

Before defining quasiperiodic tilings, we introduce a partial preorder relation
between configurations. We call this preorder the “extraction” preorder and prove
that it has good properties with respect to the notion of tilings and, later,
those notions of periodicity and quasiperiodicity.
3.1 Extraction

Definition 1. Let us consider two configurations c1 and c2. We say that c1 is
extracted from c2 if and only if any pattern that appears in c1 also appears in
c2. We denote this relation by c1 ≼ c2.

Note that if c1 ≼ c2, and if c2 is a tiling of the plane, then c1 is also a tiling
of the plane. In other terms, a configuration which is extracted from a tiling is
also a tiling: if c1 had a defect, then this defect should also appear in c2.

Let  us  now  define  what  can  be  called  a  diagonal  extraction  process.  We  use 
it  in  order  to  prove  the  following  proposition. 

Proposition 2. Assume that a sequence of patterns (M_i)_{i∈N} is given, that their
domains increase (dom(M_i) ⊂ dom(M_{i+1})), and that they cover the whole plane
(⋃_{i∈N} dom(M_i) = Z²). Then there exists a configuration d such that any pattern
that appears in d also appears in an infinite number of M_i's. If all the M_i's have
been chosen in a configuration c, then the obtained configuration d is extracted
from c (d ≼ c). Furthermore, if c is a tiling, then d is also a tiling.

Note that this diagonal extraction process is not effective; it is not an
algorithmic procedure.


3.2  Quasiperiodicity 

Definition 3. A quasiperiodic configuration is a configuration c with the
following property: for every pattern M that appears in c, there exists an integer n such
that M appears in all n × n squares in c.

A periodic configuration is also quasiperiodic. We call strictly quasiperiodic
those configurations that are quasiperiodic but not periodic. Quasiperiodicity
is a regularity property: a pattern that appears somewhere in a quasiperiodic
configuration must appear regularly.


Fig.  1.  Quasiperiodic  configurations 


If a tiling (resp. a configuration) is not quasiperiodic, then there exists a
pattern in this tiling that can be associated to an infinite number of growing
squares that belong to the tiling, and in which the pattern does not appear. In
the sequel, we call such a pattern critical for the considered tiling (resp. for the
configuration).

Lemma 4. If a pattern M is critical for a configuration c, then there exists at least
one configuration c_M extracted from c in which the pattern M does not appear.

Proof. Consider an infinite sequence of patterns in c where M does not appear.
Then make a diagonal extraction process to obtain a new configuration in which
M does not appear. As this configuration is obtained by a diagonal extraction,
it is extracted from c.

Note that M is not critical in c_M since it does not appear in it.

Proposition 5. Quasiperiodic tilings (resp. configurations) are exactly the
minimal elements for the extraction preorder. More formally, c is quasiperiodic if
and only if ∀d, d ≼ c ⇒ c ≼ d.

Proof. Consider a quasiperiodic configuration c. Assume that d ≼ c; let us prove
that c ≼ d, i.e. that any pattern of c can be found in d. Let us consider a
pattern M in c; it can be found in all sufficiently large squares of c because of
the quasiperiodicity hypothesis. Let us consider a square of the same size in d.
As d ≼ c, it appears somewhere in c and thus contains M. Hence M appears in
d. The converse is straightforward using Lemma 4.

Theorem  6.  If  a  tile  set  admits  a  tiling,  then  it  admits  a  quasiperiodic  tiling. 

Before  proving  this  theorem,  we  need  to  explain  why  the  quasiperiodicity 
property  is  compatible  with  the  extraction  preorder. 

Lemma  7.  If  a  pattern  M  is  critical  for  a  configuration  c,  and  if  c  is  extracted 
from  a  configuration  d,  then  M  is  also  critical  for  d. 

Proof. Consider d such that c ≼ d. If M is critical for c, then it appears in c, thus
in d. Furthermore, the infinite family of rectangles in which M does not appear
can be found in c, hence in d. Thus M is critical for d.

Proof  of  Theorem  6.  Remember  that  a  tile  set  is  given,  that  it  can  be  used  to 
tile  the  plane,  and  that  our  goal  is  to  prove  that  one  can  form  a  quasiperiodic 
tiling  of  the  plane  with  it. 

Let t be a tiling of the plane using this tile set. Assume that it is not
quasiperiodic. It contains some critical patterns. Among them, we consider the smallest
pattern M_1: it is not difficult to define a total ordering of patterns; first order
them by the size of their domain (more precisely by the size of the smallest
square that contains their domain) and then by alphabetic order. Also note that
the set of all patterns is countable. Using Lemma 4, we can construct a tiling
t_{M_1} ≼ t in which M_1 does not appear. Because of Lemma 7, all critical patterns
of t_{M_1} (if any) are also critical for t.

If t_{M_1} is not quasiperiodic, then we repeat this process: we choose the smallest
critical pattern M_2 in t_{M_1} and obtain t_{M_2} ≼ t_{M_1}.

If after a finite number of steps of this process we obtain a quasiperiodic
tiling, then the theorem is proved. Else, we obtain an infinite sequence of tilings
(t_{M_i}) such that ... ≼ t_{M_i} ≼ ... ≼ t_{M_1} ≼ t. Let us consider now a 1 × 1-square in
t_{M_1}, a 2 × 2-square in t_{M_2}, ..., a k × k-square in t_{M_k}, ... With a diagonal extraction
process, we obtain a tiling d which is extracted from all the t_{M_i}'s:
d ≼ ... ≼ t_{M_i} ≼ ... ≼ t. If this tiling had a critical pattern, then this pattern should be
critical for all the t_{M_i}'s. But in the pattern ordering, one of the M_i patterns is
greater than this critical pattern, which contradicts our choice of the smallest
possible pattern. Hence d is quasiperiodic and d ≼ t.

Note  that  this  proof  is  not  constructive  and  uses  the  axiom  of  choice. 


4  Quasiperiodicity  functions 

In  this  section,  we  introduce  a  quasiperiodicity  function  in  order  to  measure  the 
regularity  of  a  quasiperiodic  tiling. 

Let  us  consider  a  quasiperiodic  configuration.  Coming  back  to  the  definition 
of quasiperiodicity (Definition 3 and Figure 1), it is natural to consider the
function that maps a pattern to the smallest integer n such that the pattern appears
in  all  squares  of  size  n  x  n.  This  function  is  not  defined  on  those  patterns  that 
do  not  appear  in  the  tiling.  Since  in  the  sequel  we  are  only  interested  in  upper 
bounds,  we  can  restrict  this  function  to  square  patterns  — other  patterns  can  be 
included  in  larger  squares.  Thus  we  can  consider  the  maximum  of  this  function 
on  all  patterns  of  size  x:  we  map  x  to  the  minimal  size  of  squares  n  in  which 
one  can  find  all  those  patterns  of  size  x  that  appear  in  the  tiling.  We  call  it  the 
quasiperiodicity  function  of  the  tiling. 
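
For a periodic configuration this function can be computed by brute force, because every window is determined by its position modulo the period. The sketch below is only an illustration under that assumption: the configuration is given by one fundamental block (a list of equal-length strings, one character per cell), and the function name is hypothetical.

```python
def quasiperiodicity(block, x):
    """Smallest n such that every x-by-x pattern of the periodic configuration
    generated by `block` appears in every n-by-n window."""
    p, q = len(block), len(block[0])  # period: p rows, q columns

    def pattern(i, j):
        # The x-by-x pattern whose top-left cell is (i, j).
        return tuple(
            tuple(block[(i + di) % p][(j + dj) % q] for dj in range(x))
            for di in range(x)
        )

    all_patterns = {pattern(i, j) for i in range(p) for j in range(q)}
    n = x
    while True:
        span = n - x + 1  # positions of an x-square along each axis of an n-window
        if all(
            {pattern(i + di, j + dj) for di in range(span) for dj in range(span)}
            == all_patterns
            for i in range(p)
            for j in range(q)
        ):
            return n
        n += 1
```

The loop always stops: once n − x + 1 reaches max(p, q), every window contains one x-square for each position modulo the period, so the value never exceeds x + max(p, q) − 1.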

Intuitively,  if  this  function  grows  slowly  to  infinity,  then  the  quasiperiodic 
tiling  is  rather  regular,  but  if  it  grows  fast,  then  the  regularity  is  weak.  Using 
this  function  we  can  characterize  which  quasiperiodic  tilings  are  periodic: 

Theorem 8. A quasiperiodic tiling is periodic if and only if its quasiperiodicity
function is bounded by x ↦ x + c where c is a constant.

Proof. Let us consider a periodic tiling of period a. It is not difficult to prove
that its quasiperiodicity function is bounded by x ↦ x + a. Such a situation is
illustrated by Figure 2.

Let us consider now a quasiperiodic tiling whose function is bounded by x ↦ x + c.
Let us consider a pattern P1 of size x1 much larger than c. Let us consider a
window of size x1 + c such that its left border is just 1 cell to the right of the
left border of P1 (see Figure 3). A copy of P1 must appear in this window and
overlaps P1. Note that there are at most (c + 1)² possible positions for this copy;
this is essential in the rest of the proof.

Now let us consider a pattern P2 of size x2 > x1 containing P1. Let us
consider a window of size x2 + c such that its left border is just 1 cell to the right
of the left border of P2. We find another copy of P2 and there are at most (c + 1)²
possible translations from P2 to this copy. Note that such a translation is also
valid for P1 since P1 is embedded in P2.

Fig. 2. The quasiperiodicity function of a periodic tiling

Fig. 3. The converse

By iteration, we prove that there exists a common translation vector for all
the (P_i)_{i∈N}. Thus the tiling is periodic in at least one dimension. We use the
same reasoning to find another periodicity vector: the difference is that instead
of shifting the window to the right, we shift it in a direction which is orthogonal
to the first periodicity vector. Note that this vector may not point exactly to the
right: we can just say that it more or less points to the right.

Quasiperiodicity functions of the form x ↦ cx are not difficult to obtain. An
example is any of the quasiperiodic tilings that can be formed using Robinson’s
aperiodic tile set (see [13] or [2] for the definition). Furthermore, all these tilings
have exactly the same multiplicative constant. For Penrose tilings, the study is a
little more complicated since Penrose tiles cannot be placed on the vertices of Z².
There are several ways to measure sizes of patterns: we can consider the distance
in R² or the number of tiles included in it. These distances lead to different
quasiperiodicity functions but each of them is bounded by a multiplicative
non-zero constant times the other one. Anyway, all quasiperiodicity functions of
quasiperiodic Penrose tilings are of the form x ↦ cx.

Some important questions are still open: what are all quasiperiodicity functions
that can be observed in quasiperiodic tilings? By “observed” we mean that
all quasiperiodic tilings that can be formed with the considered tile set should be
of the desired form. Otherwise it is not difficult to construct such tilings with a
trivial tile set. Our last open problems are the following: is it possible to observe
non-recursive quasiperiodicity functions? If a quasiperiodicity function is
non-recursive and grows faster than any recursive function then the quasiperiodic
tiling is regular but this regularity cannot be measured.


5  Counting 

In  this  section,  we  introduce  two  structures:  trees  associated  to  tile  sets,  and 
trees  associated  to  tilings  of  the  plane;  we  are  inspired  by  [4]  to  introduce  them. 
We  then  combine  these  structures  with  the  quasiperiodicity  notion  in  order  to 
prove  the  main  result  of  this  section  (Theorem  13). 

5.1 Trees

In the rest of this section, we only consider valid square patterns (the matching
condition for the edges of the tiles is true inside the pattern). We call $n$-pattern
any $2n \times 2n$ square pattern and we say that an $(n+1)$-pattern extends an $n$-pattern
if the $n$-pattern is the center of the $(n+1)$-pattern. In other words, the $(n+1)$-pattern
is obtained from the $n$-pattern by putting tiles around its border. Note
that it is not always possible to do this because the matching condition must
be true in this new border. A unique 0-pattern exists for all tile sets: it is the
pattern with the empty domain.

Definition 9. The tree associated to a tile set $\tau$ is the tree $A_\tau$ such that the
vertices of $A_\tau$ are the $n$-patterns formed using the tiles of $\tau$; the root is the
0-pattern; the children of an $n$-pattern node are those $(n+1)$-patterns that extend
the $n$-pattern.


The tree so defined can be finite or infinite. All nodes are of finite degree but
these degrees may not be bounded. Note that an infinite path in $A_\tau$ corresponds
to a tiling of the plane with $\tau$. Conversely, let us consider a tiling of the plane
with $\tau$, and all $n$-patterns centered in the cell (0,0). These patterns correspond
to an infinite path in $A_\tau$.

If the height of $A_\tau$ is not bounded, using König's infinity lemma one can
claim that it contains an infinite path and thus that it is possible to tile the
plane (see also Proposition 2).

Definition 10. Let $c$ be a valid tiling for the tile set $\tau$. The tree associated to the
tiling $c$ is the tree $A_c$ such that $A_c$ is the restriction of $A_\tau$ containing all
$n$-patterns of $A_\tau$ that can be found in $c$.

All branches of the tree $A_c$ are infinite since a pattern that appears somewhere
in $c$ can always be extended. Any infinite path of $A_c$ corresponds to a
tiling that can be extracted from $c$. Thus we obtain the following proposition:

Proposition 11. Let $c$ and $d$ be two tilings of the plane with $\tau$. $A_c \subseteq A_d$ if and
only if $c \preceq d$. If $c$ is quasiperiodic, and if there exists $d$ such that $A_d \subseteq A_c$, then
$A_d = A_c$.

Another interpretation of the previous proposition is the following: one can
properly restrict $A_c$ into some $A_d$ if and only if $c$ is not quasiperiodic.


5.2  Periodicity  and  quasiperiodicity 

Let  us  now  explain  the  difference  between  the  tree  associated  to  a  periodic  tiling 
and  the  tree  associated  to  a  strictly  quasiperiodic  one.  Let  us  call  chain  of  a 
tree  an  infinite  path  in  the  tree  in  which  every  node  has  exactly  one  child;  the 
starting  node  of  the  chain  is  a  node  of  the  tree  (usually  not  the  root). 

Proposition 12. If $c$ is quasiperiodic and if $A_c$ contains a chain, then $c$ is
periodic.

Proof. Consider the starting node of the chain or, more precisely, the pattern $M$
that is associated to it. There is no branching on this node and below, hence if the
pattern $M$ appears in $c$ centered in (0,0) and in $(i,j)$, then $(i,j)$ is a periodicity
vector of $c$. As $c$ is quasiperiodic, the pattern $M$ appears in all sufficiently large
regions of $c$. Hence we can find two periodicity vectors for $c$ with different directions;
$c$ is periodic.

Now  we  can  present  our  main  theorem.  Its  proof  is  easy  with  the  help  of  the 
previous  properties. 

Theorem  13.  If  a  tile  set  can  be  used  to  form  a  strictly  quasiperiodic  tiling  of 
the  plane,  then  it  can  form  an  uncountable  number  of  different  tilings. 

First  remark  that  this  result  is  unchanged  if  we  consider  that  two  tilings  that 
can  be  superimposed  are  equal.  In  this  case,  one  can  transform  one  of  the  tilings 
into  the  other  by  a  translation.  The  set  of  translations  is  countable  hence  the 
theorem  is  still  valid. 

Proof. Let $c$ be a strictly quasiperiodic tiling of the plane. $A_c$ does not contain
any chain, otherwise $c$ would be periodic (Propositions 11 and 12). Thus $A_c$ contains
an uncountable number of infinite paths. We can associate to each of these paths
a tiling if we consider that all the patterns of the path are centered in the origin.
Two different paths are associated to two different tilings, thus the number of
different tilings that can be formed is not countable.

Note that the uncountable set of tilings that is obtained in this proof consists
of quasiperiodic tilings that can be mutually extracted; all these tilings can be
obtained from $c$ by extraction.

A  corollary  of  this  result  is  that  one  cannot  separate  quasiperiodic  tilings  with 
any  computing  device  (computing  devices  usually  belong  to  countable  sets). 
6 Topology

We  present  in  this  section  another  approach  to  tiling  problems.  This  approach 
is  based  on  the  topological  properties  of  the  set  of  configurations. 

Let us endow a tile set $\tau$ with the discrete topology, for which all subsets are
open. A configuration is a mapping of the plane $\mathbb{Z}^2$ into the tile set. Thus, the
set $\tau^{\mathbb{Z}^2}$ of all configurations is a countable product of sets that we endow with
the product topology: an open subset of $\tau^{\mathbb{Z}^2}$ is a union of finite intersections of
sets of the form $O_i^a = \{c \in \tau^{\mathbb{Z}^2},\ c(i) = a\}$.

In this topological approach, the notion of patterns is very natural since they
correspond with basic open sets. More precisely, we can define a basic open set
associated to a pattern $p$ as the set of all configurations equal to the pattern on
its domain:

$$O_p = \{c \in \tau^{\mathbb{Z}^2},\ c|_{\mathrm{domain}(p)} = p\}.$$

Note that the $O_p$'s (and the $O_i^a$'s, which are special $O_p$'s) are both open and closed:
their complements are finite unions of the $O_{p'}$ where $\mathrm{domain}(p') = \mathrm{domain}(p)$
and $p' \neq p$. Any open set $U$ can be written as a union of basic open sets:

$$U = \bigcup_{p\ \text{pattern},\ O_p \subseteq U} O_p.$$

Proposition 14. $\tau^{\mathbb{Z}^2}$ is a compact metric space.

We shall use very often in the rest of this section the compactness of $\tau^{\mathbb{Z}^2}$ and
more precisely the compactness of the set of tilings that can be formed using
$\tau$. Let us denote by $T_\tau$ this particular subset of configurations (which can be
empty).

Proposition 15. Let $\tau$ be a tile set. The subset $T_\tau$ of $\tau^{\mathbb{Z}^2}$ consisting of tilings
of the plane by $\tau$ is compact.

Furthermore, our process of diagonal extraction (Proposition 2) can be seen
as a consequence of the compactness and of the shift invariance of $T_\tau$.

Now let us interpret our relation of extraction (see Definition 1) in topological
terms. To do that, let us consider the horizontal and vertical shifts $\sigma_h$ and $\sigma_v$.
Let us define $\Gamma(c)$ as the topological closure of the set of all images of $c$ by any
shift. It is natural to construct such a set since we tend to consider that two
configurations that can be superimposed are the same. In the following formal
definition, the topological closure is denoted by an overline:

$$\Gamma(c) = \overline{\bigcup_{i,j \in \mathbb{Z}} \{\sigma_h^i \circ \sigma_v^j (c)\}}.$$

Proposition 16. Our relation of extraction corresponds exactly to the inclusion
of our sets $\Gamma(c)$. More formally the following properties are equivalent:

(a) $c_1 \preceq c_2$,

(b) $c_1 \in \Gamma(c_2)$,

(c) $\Gamma(c_1) \subseteq \Gamma(c_2)$.

Note that if $\Gamma(c_1) \subseteq \Gamma(c_2)$ and if $c_2$ is a tiling then $c_1$ is also a tiling. It can
be interpreted as a monotonicity property of tilings.

Let us come back now to quasiperiodicity. We obtain from Section 3.2
that $q$ is quasiperiodic if and only if $\Gamma(q)$ is minimal for the inclusion relation
among all $\Gamma(c)$'s. Let us come back to our Theorem 6: “if a tile set can tile
the plane, then it can be used to form a quasiperiodic tiling of the plane”. In
our context, it corresponds to the existence of a minimal $\Gamma(q)$ among all $\Gamma(c)$'s
corresponding to tilings. Assume that it is possible to tile the plane; then using
the monotonicity property of tilings and Zorn's lemma, we obtain the existence
of a quasiperiodic tiling.

We do not know how to prove our combinatorial theorem (Theorem 13 of
Section 5), or to interpret quasiperiodicity functions of Section 4 using only
topological arguments.
Abstract. We prove that any $\mathbb{N}$-rational sequence $s = (s_n)_{n \ge 1}$ of
nonnegative integers satisfying the Kraft strict inequality $\sum_{n \ge 1} s_n k^{-n} < 1$
is the enumerative sequence of leaves by height of a rational $k$-ary tree.
Particular cases of this result had been previously proven. We give some
partial results in the equality case.


1  Introduction 

This  paper  is  a  study  of  problems  linked  with  coding  and  symbolic  dynamics. 
The  results  can  be  considered  as  an  extension  of  the  old  results  of  Huffman, 
Kraft,  McMillan  and  Shannon  on  source  coding.  We  actually  prove  results  on 
rational  sequences  of  integers  that  can  be  realized  as  the  enumerative  sequence 
of  leaves  in  a  rational  tree. 

Let $s$ be an $\mathbb{N}$-rational sequence of nonnegative numbers, that is a sequence
$s = (s_n)_{n \ge 1}$ such that $s_n$ is the number of paths of length $n$ going from an initial
state to a final state in a finite multigraph or a finite automaton. We say that $s$
satisfies the Kraft inequality for a positive integer $k$ if $\sum_{n \ge 1} s_n k^{-n} \le 1$.

A rational tree is a tree which has only a finite number of non-isomorphic
subtrees. If $s$ is the enumerative sequence of leaves of a rational $k$-ary tree, then
$s$ satisfies Kraft's inequality for the integer $k$.

In this paper, we study the converse of the above property. Consider for
example the series $s(z) = 3z^2/(1 - z^2)$. We have $s(1/2) = 1$ and we can obtain $s$ as
the enumerative sequence of the tree of the figure below associated with the
prefix code $A = (aa)^*(ab + ba + bb)$ on the binary alphabet $\{a, b\}$. We do not
know however if the same can be done for the series $s(z) = [\ldots]$.


Fig. 1. Tree associated to $3z^2(z^2)^*$

Known constructions allow one to obtain a sequence $s$ satisfying Kraft's
inequality as the enumerative sequence of leaves of a $k$-ary tree, or as the
enumerative sequence of leaves of a (perhaps not $k$-ary) rational tree. These two
constructions lead in a natural way to the problem of building a tree both rational
and $k$-ary. This question was already considered in [9], where it was conjectured
that any $\mathbb{N}$-rational sequence satisfying Kraft's inequality is the enumerative
sequence of leaves of a $k$-ary rational tree.

In this paper, we prove this conjecture in the case where the sequence satisfies
Kraft's inequality with a strict inequality, and we give some partial results in
the equality case. For example, we state the following weaker property for such
a sequence: if $s$ is an $\mathbb{N}$-rational sequence of nonnegative numbers satisfying
Kraft's equality, then there is a positive integer $m$ such that $ms$ is a sum of
sequences, each of which is the enumerative sequence of the leaves of a $k$-ary rational tree.

Proofs  and  algorithms  used  to  establish  the  results  are  based  on  automata 
theory  and  symbolic  dynamics.  In  particular,  we  use  the  state  splitting  algorithm 
which  has  been  introduced  by  R.  Adler,  D.  Coppersmith  and  M.  Hassner  in  [1] 
to  solve  coding  problems  for  constrained  channels  by  constructing  finite-state 
codes  with  sliding  block  decoders.  This  was  partly  based  on  earlier  work  of  B. 
Marcus  in  [7]. 

A variant of the problem considered here consists in replacing the enumerative
sequence of leaves by the enumerative sequence of all nodes. Soittola ([11])
has characterized the series which are the enumerative sequence of nodes in a
rational tree. The problem of a similar characterization for rational $k$-ary trees
remains open in the general case.

In [9], a particular case is treated. It allows one to solve the problem for the
enumerative sequence of leaves in the equality case under the additional assumption
of a unique pole of minimal modulus.

The paper is organized as follows. We first give basic definitions and
properties of rational objects, sequences and trees. We then give some definitions
coming from the theory of symbolic dynamics. We define the notions of state
splitting and approximate eigenvector, and recall the algorithm of [1]. In Section 3,
we establish the announced results and give examples for the constructions.

2  Definitions  and  background 

2.1  Rational  sequences  of  nonnegative  numbers 

We denote by $G$ a directed graph with $E$ as its set of edges. We actually use
multigraphs instead of ordinary graphs in order to be able to have several distinct
edges with the same origin and end. Formally a multigraph is given by two sets
$E$ (the edges) and $V$ (the vertices) and two functions from $E$ to $V$ which define
the origin and the end of an edge. An edge in a multigraph going from $p$ to $q$
will be noted $(p, x, q)$ where $x \in \mathbb{N}$. This amounts to numbering the edges going
from $p$ to $q$ in order to distinguish them. We shall always say “graph” instead
of “multigraph”.

In this paper, we consider sequences of nonnegative numbers. Such a sequence
$s = (s_n)_{n \ge 0}$ will be said to be $\mathbb{N}$-rational if $s_n$ is the number of paths of length
$n$ going from a state in $I$ to a state in $F$ in a finite directed graph $G$, where $I$
and $F$ are two special subsets of states, the initial and final states respectively.
We say that the triple $(G, I, F)$ is a representation of the sequence $s$.

This definition is usually given for the series $s(z) = \sum_{n \ge 0} s_n z^n$ instead of the
sequence $s$. Any $\mathbb{N}$-rational sequence $s$ satisfies a linear recurrence relation with integer
coefficients. However, it is not true that every sequence of nonnegative integers
satisfying a linear recurrence relation is $\mathbb{N}$-rational. An example can be found in
[5] p. 93.
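The definition can be made concrete by counting paths level by level. The sketch below is only an illustration under stated assumptions: the graph is given as a dictionary mapping a pair of states `(p, q)` to the number of parallel edges from `p` to `q`, the function names are hypothetical, and `EDGES` is an assumed representation whose paths from `"i"` to `"f"` mirror the prefix code $(aa)^*(ab + ba + bb)$ of the introduction. Exact rational arithmetic is used for the partial sums of the Kraft series.

```python
from fractions import Fraction

def path_counts(edges, initial, final, n_max):
    """Return [s_0, ..., s_n_max], where s_n counts the paths of length n
    going from a state of `initial` to a state of `final`."""
    counts = {p: 1 for p in initial}      # paths of length 0, by end state
    s = []
    for _ in range(n_max + 1):
        s.append(sum(c for p, c in counts.items() if p in final))
        nxt = {}
        for (p, q), mult in edges.items():
            if p in counts:
                nxt[q] = nxt.get(q, 0) + counts[p] * mult
        counts = nxt
    return s

def kraft_partial_sum(s, k):
    """Exact value of the sum of s_n / k^n over the given finite prefix."""
    return sum(Fraction(s_n, k ** n) for n, s_n in enumerate(s))

# Assumed example: three parallel edges i -> p, a loop p -> q -> p of length 2,
# and one edge p -> f into the final state.
EDGES = {("i", "p"): 3, ("p", "q"): 1, ("q", "p"): 1, ("p", "f"): 1}
s = path_counts(EDGES, {"i"}, {"f"}, 6)
print(s, kraft_partial_sum(s, 2))
```

The partial sums stay below 1 and approach it as more terms are included, which is the equality case of Kraft's inequality for $k = 2$.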

A well known result in automata theory allows us to use a particular
representation of an $\mathbb{N}$-rational sequence $s$. One can choose a representation $(G, i, F)$
of $s$ with a unique initial state $i$ and such that:

- no edge is coming in state $i$,
- no edge is going out of any state of $F$.

Such a representation is called a normalized representation. Moreover, it is
possible to reduce to one state the set of final states (see for example [10] p. 14).

We now give some basic definitions about trees. A tree $T$ on a set of nodes
$N$ with a root $r$ is a function $T : N \setminus \{r\} \to N$ which associates to each node
distinct from the root its father $T(n)$ in such a way that, for each node $n$, there
is a nonnegative integer $h$ such that $T^h(n) = r$. The integer $h$ is the height of the
node $n$. A tree is $k$-ary if each node has at most $k$ sons. A leaf is a node without
son. We denote by $l(T)$ the enumerative sequence of its leaves by height, that is
the sequence of numbers $s_n$ where $s_n$ is the number of leaves of $T$ at height $n$.
A tree is said to be rational if it admits only a finite number of non-isomorphic
subtrees. If $T$ is a rational tree, the sequence $l(T)$ is an $\mathbb{N}$-rational sequence.

The sequence $s = l(T)$ of a $k$-ary tree is the length distribution of a prefix
code over a $k$-letter alphabet. The associated series $s(z) = \sum_{n \ge 0} s_n z^n$ then satisfies
Kraft's inequality: $s(1/k) \le 1$. We shall say that Kraft's strict inequality
is satisfied when $s(1/k) < 1$. The equality is reached when each node of the tree
has exactly zero or $k$ sons. Conversely, the McMillan construction establishes
that for any series $s$ satisfying Kraft's inequality, there is a $k$-ary tree $T$ such that
$s = l(T)$. Moreover, if the series satisfies Kraft's equality, then the internal nodes
will have exactly $k$ sons. But the tree obtained is not rational in general.

It is also easy to see that an $\mathbb{N}$-rational sequence is the enumerative sequence
of the leaves of a rational tree. A normalized representation can be used to do
that by “developing” the tree. The root will correspond to the initial state of
the graph. If a node of the tree at height $n$ corresponds to a state $i$ in the graph
which has $r$ outgoing edges ending at states $j_1, j_2, \ldots, j_r$, it will admit $r$ sons at
height $n+1$, each of them corresponding respectively to the states $j_1, j_2, \ldots, j_r$
of the graph. The leaves of the tree will correspond to the final states of the
normalized representation. The maximal number of sons of a node we get is
then equal to the maximal number of edges going out of any state of the graph
of this representation.

If $s$ satisfies Kraft's inequality, the above construction does not lead in general
to a $k$-ary rational tree. The aim of this paper is to get a $k$-ary rational tree $T$
such that $s = l(T)$. This result was conjectured in [9]. We prove it for all
$\mathbb{N}$-rational sequences satisfying Kraft's strict inequality and give a weaker result
for the equality case.
2.2 Approximate eigenvector and state splitting

Let $s$ be an $\mathbb{N}$-rational sequence and let $(G, i, F)$ be a normalized representation
of $s$. If we identify the initial state $i$ and all final states of $F$ in a single state
still denoted $i$, we get a new graph denoted $\overline{G}$, which is strongly connected. The
sequence $s$ is then the length distribution of the paths of first returns to state $i$,
that is of finite paths going from $i$ to $i$ without going through state $i$ in between. Using the
terminology of symbolic dynamics, the graph $\overline{G}$ can be seen as an irreducible
shift of finite type (see, for example, [3], [4] or [6]).

We denote by $M$ the adjacency matrix associated to the graph $\overline{G}$, that is the
matrix $M = (m_{ij})_{1 \le i,j \le n}$, where $n$ is the number of nodes of $\overline{G}$ and where $m_{ij}$
is the number of edges going from state $i$ to state $j$. By the Perron-Frobenius
theorem (see [6]), the nonnegative matrix $M$ associated to the strongly connected
graph $\overline{G}$ has a positive eigenvalue of maximal modulus denoted by $\lambda$, also called
the spectral radius of the matrix. Actually, $\lambda$ only depends on the series $s$: $1/\lambda$
is the minimal modulus of the poles of $1/(1 - s(z))$. The dimension of the eigenspace of
$\lambda$ is equal to one. There is a positive eigenvector (componentwise) associated to
$\lambda$. Moreover, if there is a positive eigenvector associated to an eigenvalue $\rho$, then
$\rho = \lambda$.

When $\lambda$ is an integer, the matrix admits a positive integral eigenvector. When
$\lambda \le k$, where $k$ is an integer, the matrix admits a $k$-approximate eigenvector,
that is, by definition, a positive integral vector $v$ with $Mv \le kv$.

For example the left side of the figure below gives a representation $(G, i, F)$
of the series $s(z) = 3z^2/(1 - z^2)$ and the right side gives the associated graph $\overline{G}$. The
adjacency matrix of $\overline{G}$ is

$$M = \begin{pmatrix} 0 & 3 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.$$

Its maximal eigenvalue is $\lambda = 2$. The components of a positive integral eigenvector
are written on the nodes.

Fig. 3. Graph $\overline{G}$
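The eigenvalue claim can be checked by hand or with a few lines of code. The sketch below reads the adjacency matrix of this example as the rows (0, 3, 0), (1, 0, 1), (0, 1, 0); the vector (3, 2, 1) is our own computation, not a value recovered from the figure, and the function names are hypothetical. It verifies $Mv = 2v$ and tests the inequality $Mv \le kv$ that defines a $k$-approximate eigenvector.

```python
def mat_vec(M, v):
    """Product of a square integer matrix (list of rows) with a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def is_approximate_eigenvector(M, v, k):
    """True when v is a positive integral vector with M v <= k v componentwise."""
    positive = all(isinstance(x, int) and x > 0 for x in v)
    return positive and all(a <= k * b for a, b in zip(mat_vec(M, v), v))

M = [[0, 3, 0],
     [1, 0, 1],
     [0, 1, 0]]
v = [3, 2, 1]                # assumed eigenvector, computed from M v = 2 v

print(mat_vec(M, v))                              # equals 2 * v
print(is_approximate_eigenvector(M, v, 2))
print(is_approximate_eigenvector(M, [1, 1, 1], 2))
```

An exact eigenvector for an integer eigenvalue is in particular an approximate eigenvector, while the all-ones vector fails for $k = 2$ because the first state has three outgoing edges.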

Proposition 1. If $s$ satisfies Kraft's inequality $s(1/k) \le 1$, then $\lambda \le k$. In the
equality case where $s(1/k) = 1$ we have $\lambda = k$.

For a proof, we refer the reader to [3], [4] or [6].

Fig. 2. Representation $(G, i, F)$

We now define the operation of output state splitting in a graph $G = (V, E)$.
Let $q$ be a vertex of $G$ and let $I$ (resp. $O$) be the set of edges coming in $q$ (resp.
going out of $q$). Let $O = O' + O''$ be a partition of $O$. The operation of (output)
state splitting relative to $(O', O'')$ transforms $G$ into the graph $G' = (V', E')$
where $V' = (V \setminus \{q\}) \cup \{q'\} \cup \{q''\}$ is obtained from $V$ by splitting state $q$ into two
states $q'$ and $q''$, and where $E'$ is defined as follows:

1. all edges of $E$ that are not incident to $q$ are left unchanged.

2. both states $q'$ and $q''$ have the same input edges as $q$.

3. the output edges of $q$ are distributed between $q'$ and $q''$ according to the
partition of $O$ into $O'$ and $O''$. We denote by $U'$ and $U''$ the sets of output
edges of $q'$ and $q''$ respectively:

$$U' = \{(q', x, p) \mid (q, x, p) \in O'\} \quad\text{and}\quad U'' = \{(q'', x, p) \mid (q, x, p) \in O''\}.$$

Fig. 4. Graph $G$

Fig. 5. Graph $G'$

Let us now assume that $v$ is a $k$-approximate eigenvector for the graph $G$.
We denote by $v_p$ the component of index $p$ of $v$. All components $v_p$ are positive
integers. A state splitting of a state $q$ is said to be admissible according to $k$ if
the partition in $O'$ and $O''$ is such that $O'$ and $O''$ are not empty and:

$$k \text{ divides } \sum_{(q,x,r) \in O'} v_r.$$

If the state splitting is admissible (according to $k$), the vector $v'$ defined as
follows will be a $k$-approximate eigenvector for the new graph $G'$. If $p$ is a state
distinct from $q'$ and $q''$ then $v'_p = v_p$. For states $q'$ and $q''$ we have:

$$v'_{q'} = \frac{1}{k} \sum_{(q,x,r) \in O'} v_r \quad\text{and}\quad v'_{q''} = v_q - v'_{q'}.$$

By the state splitting construction, one can check that $M'v' \le kv'$, where $M'$
is the adjacency matrix of $G'$.
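A single admissible output state splitting can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: edges are triples `(p, x, r)`, the approximate eigenvector is a dictionary from states to positive integers, the two new states are named `(q, 1)` and `(q, 2)`, and the new values are taken to be the sum over $O'$ divided by $k$ for the first copy and the remainder of $v_q$ for the second. The example graph and vector are assumed data.

```python
def is_approximate(edges, v, k):
    """Check that v is positive and that, for every state p, the sum of v[r]
    over the edges (p, x, r) is at most k * v[p]."""
    sums = {p: 0 for p in v}
    for (p, _, r) in edges:
        sums[p] += v[r]
    return all(val > 0 for val in v.values()) and \
        all(sums[p] <= k * v[p] for p in v)

def split_state(edges, v, q, o_prime, k):
    """Output-split state q according to the partition (O', O''), where
    o_prime lists the edges of O'. Returns the new edges and the new vector."""
    out = [e for e in edges if e[0] == q]
    if any(e not in out for e in o_prime):
        raise ValueError("O' must consist of edges going out of q")
    o_second = [e for e in out if e not in o_prime]
    total = sum(v[r] for (_, _, r) in o_prime)
    if not o_prime or not o_second or total % k != 0:
        raise ValueError("the splitting is not admissible")
    q1, q2 = (q, 1), (q, 2)              # hypothetical names for q' and q''
    new_edges = []
    for (p, x, r) in edges:
        sources = [p] if p != q else [q1 if (p, x, r) in o_prime else q2]
        targets = [r] if r != q else [q1, q2]   # input edges are duplicated
        new_edges += [(a, x, b) for a in sources for b in targets]
    new_v = {p: val for p, val in v.items() if p != q}
    new_v[q1] = total // k
    new_v[q2] = v[q] - total // k
    return new_edges, new_v

# Assumed example: three parallel edges 1 -> 2, and edges 2 -> 1, 2 -> 3, 3 -> 2.
EDGES = [(1, 0, 2), (1, 1, 2), (1, 2, 2), (2, 0, 1), (2, 0, 3), (3, 0, 2)]
V = {1: 3, 2: 2, 3: 1}
new_edges, new_v = split_state(EDGES, V, 1, [(1, 0, 2)], 2)
print(new_v, is_approximate(new_edges, new_v, 2))
```

In this example state 1, of value 3, is replaced by two states of values 1 and 2, and the inequality $M'v' \le kv'$ still holds for $k = 2$.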

The state splitting algorithm of [1] ensures that there is a finite number of
state splittings leading to a $k$-ary graph, that is a graph such that at most $k$
edges are going out of any state. For the sake of completeness, we briefly recall
the proof. If there is a state $q$ which admits more than $k$ edges going out of it,
we choose $k$ of them and denote by $r_1, r_2, \ldots, r_k$ the sequence of end states of
these edges. We then choose a nonempty subset $O'$ of these $k$ edges such that $k$ divides
$\sum_{(q,x,r) \in O'} v_r$. This is always possible. Indeed, by considering the $k + 1$ numbers
$0, v_{r_1}, v_{r_1} + v_{r_2}, \ldots, v_{r_1} + v_{r_2} + \cdots + v_{r_k}$, we can see that at least two of them are equal
modulo $k$, and then their difference is equal to zero modulo $k$. The partition of
the output edges of $q$ in $O'$ and $O''$ leads to an admissible state splitting, and $v'_{q'}$
and $v'_{q''}$ are strictly less than $v_q$. This point ensures that the process stops after a finite
number of splits, the final number of states being bounded by the sum of the
components of the initial approximate eigenvector. The final graph obtained is
$k$-ary.
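The pigeonhole argument is constructive. As a minimal sketch, assuming the values of the end states of the $k$ chosen edges are given as a list of positive integers, the following hypothetical function returns a block of consecutive positions whose sum is divisible by $k$, by looking for two equal prefix sums modulo $k$ (the empty prefix, of sum 0, is included).

```python
def divisible_block(values, k):
    """Return (i, j) with i < j such that sum(values[i:j]) is divisible by k.

    Such a block always exists when len(values) >= k, because the k + 1
    prefix sums 0, v_1, v_1 + v_2, ... cannot all differ modulo k.
    """
    seen = {0: 0}                 # residue of a prefix sum -> prefix length
    total = 0
    for j, val in enumerate(values, start=1):
        total = (total + val) % k
        if total in seen:
            return seen[total], j
        seen[total] = j
    raise ValueError("at least k values are needed")

i, j = divisible_block([2, 3, 2, 2], 4)
print(i, j, sum([2, 3, 2, 2][i:j]))
```

The edges at positions `i` to `j - 1` can then serve as the subset $O'$ of an admissible splitting.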

We shall compute approximate eigenvectors for the strongly connected graphs
$\overline{G}$ associated to normalized representations $(G, i, F)$ of sequences. We shall then
perform admissible state splittings that can be seen either on the graph $G$ or
on the graph $\overline{G}$. To do that, we shall associate to each node of $G$ a value equal
to the corresponding component of the approximate eigenvector of the graph $\overline{G}$.
The initial and the final states will have the same value since they correspond to the
same state of $\overline{G}$.


3  The  results 

We  now  state  the  result  in  the  case  of  Kraft  strict  inequality. 

Theorem 2. Let $s = (s_n)_{n \ge 1}$ be an $\mathbb{N}$-rational sequence of nonnegative integers
and let $k$ be an integer such that $\sum_{n \ge 1} s_n k^{-n} < 1$. Then there is a $k$-ary rational
tree such that $s$ is the enumerative sequence of its leaves.

In order to prove this result, we first prove one lemma that remains true in the
equality case. We therefore consider an $\mathbb{N}$-rational sequence $s$ and an integer $k$
such that $\sum_{n \ge 1} s_n k^{-n} \le 1$. We begin with a normalized representation $(G, i, F)$
of the $\mathbb{N}$-rational sequence $s$. We denote by $M$ the adjacency matrix of $\overline{G}$ and by
$\lambda$ its spectral radius. Then $\lambda \le k$. We then compute a $k$-approximate eigenvector
$v = (v_1, v_2, \ldots)$ of the graph $\overline{G}$. By definition, we have $Mv \le kv$. Without
loss of generality, we can assume that state 1 is the initial state in all normalized
representations.

Lemma 3. If $k$ divides $v_1$, then there is another normalized representation for
$s$ and a new corresponding approximate eigenvector $v'$ with $v'_1 = v_1 \operatorname{div} k$.

Proof. We denote by $P$ the set of states $q$ such that there is in $G$ an edge denoted
$(q, x, t)$ going from $q$ to a final state $t$ of $F$. Remark that, as state $t$ is equal to
state 1 in $\overline{G}$, the value of state $t$ is equal to the value of state 1.

Let us first suppose that the initial state 1 does not belong to the set $P$. If
there is in $P$ a state $q$ which admits more than one (say $n$) outgoing edge, we
split $q$ in $q'$ and $q''$ according to the partition $(O', O'')$ where $O' = \{(q, x, t)\}$. Since $k$
divides $v_1$, this state splitting is admissible and $v'_{q'} = v_1 \operatorname{div} k$. Moreover, in the
new graph $G'$, $q'$ admits only one outgoing edge (going to $t$) and $q''$ is either not
in $P$ or admits fewer than $n$ outgoing edges. By successive state splittings of all
states in $P$ having more than one outgoing edge, we will get, in a finite number
of steps, a representation such that all states with one outgoing edge ending in
$F$ have no other outgoing edges. Under the hypothesis that state 1 does not
belong to $P$, the initial state has not been split during this process and so each
new computed graph is still a normalized representation of the sequence. We
denote again by $(G, 1, F)$ the final representation obtained for $s$ and by $P_{\text{last}}$ the
set of states having one outgoing edge ending in $F$ in this graph. Remark that
the values of states of $P_{\text{last}}$ are greater than or equal to $v_1 \operatorname{div} k$. We turn all
values of states of $P_{\text{last}}$ greater than $v_1 \operatorname{div} k$ into $v_1 \operatorname{div} k$; the vector $v$ remains
a $k$-approximate eigenvector.

We then transform the representation $(G, 1, F)$ into a new one, $(H, i, P_{\text{last}})$,
where $H$ is the graph obtained from $G$ by adding a state $i$, an edge from $i$ to
1 and by removing all edges of $G$ going out of a state of $P_{\text{last}}$. If we look at
paths in $G$ going from 1 to $F$, we have just cut the last edge and added one
at the beginning. We assign to state $i$ the value $v_1 \operatorname{div} k$, and the values of all
states correspond now to a new approximate eigenvector for $H$. We call this
transformation the “shift” transformation.

Let us now suppose that the initial state 1 belongs to $P$. We first split, as
explained above, all states of $P$ having more than one outgoing edge. In this
case, state 1 may have been split. We denote by $1_{(1)}, 1_{(2)}, 1_{(3)}, \ldots, 1_{(r)}$ the copies
of state 1 obtained by successive state splittings of the initial state 1. We still
denote by $G$ the graph obtained by this transformation and by $P_{\text{last}}$ the set of
states having one outgoing edge ending in $F$ in this graph. We then transform
the representation $(G, 1, F)$ into a new one, $(H, i, P_{\text{last}})$, where $H$ is the graph
obtained from $G$ by adding a state $i$, an edge from $i$ to each $1_{(j)}$, $1 \le j \le r$, and
by removing all edges of $G$ going out of a state of $P_{\text{last}}$. Remark that $(r - 1)$
states among $1_{(1)}, 1_{(2)}, 1_{(3)}, \ldots, 1_{(r)}$ belong to $P_{\text{last}}$. We again assign to the state
$i$ the value $v_1 \operatorname{div} k$, and the values of all states correspond now to a new
$k$-approximate eigenvector for $H$.

Corollary 4. If $v_1$ is a power of $k$, then there is another normalized
representation and a new corresponding approximate eigenvector $v'$ with $v'_1 = 1$.

Proof. If $v_1 = k^m$, we iterate the construction given in the previous lemma and get
$v'_1 = 1$ in $m$ steps.


Example 

Let $s$ be the following series:

s[z)  =  ‘lz^^2z'^  {zHz'^yy 

Here,  k  —  2  and  s{l/2)  1. 

In  the  following  pictures,  the  nodes  are 
labeled  with  their  value. 


Fig. 6. Initial normalized representation

First step

Fig. 7. First state splitting
Fig. 8. First “shift”

Second step

Fig. 9. Other state splittings
Fig. 10. Second “shift”


The  last  step  is  described  in  the  proof  of 
Theorem  2. 

It  corresponds  here  to  a  state  splitting  of 
all  states  of  the  graph  of  value  different 
from  1. 


Fig.  11.  Last  representation 

We  now  prove  another  lemma  which  is  true  only  in  the  case  of  Kraft  strict 
inequality. 


Lemma 5. Let $M$ be a nonnegative integral matrix. If its spectral radius is strictly less than $k$, then there is a $k$-approximate eigenvector $w$ of $M$ such that $w_1$ is a power of $k$.
Proof. Let $\lambda$ ($\lambda < k$) be the positive real eigenvalue of maximal modulus of $M$ and let $v$ be an eigenvector associated to $\lambda$. We denote by $P$ the set of positive vectors $w$ such that $Mw < kw$. The set $P$ is an open set and $v$ belongs to $P$. By dividing all components of $v$ by $v_1$, we can assume that $v_1$ is equal to 1. As $P$ is open, there is a positive real $\epsilon$ such that $B(v, \epsilon) \subset P$, where $B(v, \epsilon) = \{w \mid v_i - \epsilon < w_i < v_i + \epsilon\}$. Let us now choose an integer $m$ such that $1/k^m < \epsilon$. As $B(v, 1/k^m) \subset P$, we have $\{k^m w \mid w \in B(v, 1/k^m)\} \subset P$. This set is $\{w \mid k^m v_i - 1 < w_i < k^m v_i + 1\}$ and contains $w$ where $w_i = \lceil k^m v_i \rceil$. This vector is a positive integer vector $w$ with $Mw < kw$: it is a $k$-approximate eigenvector. Moreover $w_1 = k^m$.
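
The scaling-and-rounding step of this proof can be sketched in Python. This is only an illustration under stated assumptions: the function name `approximate_eigenvector`, the bound `max_m` and the sample matrix are hypothetical, the input vector is assumed to be a positive rational vector already lying in the open set of the proof, and instead of computing the radius of the ball explicitly the sketch simply tries m = 0, 1, 2, ... and tests the componentwise inequality Mw <= kw (the non-strict form is an assumption made here).

```python
from fractions import Fraction
from math import ceil

def approximate_eigenvector(M, v, k, max_m=64):
    """Search for the smallest m such that w = ceil(k**m * v) satisfies M w <= k w.

    M is a square nonnegative integer matrix (list of rows), v a positive
    rational vector with M v < k v componentwise; v is rescaled so that v[0] = 1.
    Returns (m, w); by construction w[0] == k**m.
    """
    n = len(M)
    first = Fraction(v[0])
    v = [Fraction(x) / first for x in v]          # normalise: first component 1
    for m in range(max_m + 1):
        w = [ceil(k ** m * x) for x in v]         # scale by k^m, round up
        if all(sum(M[i][j] * w[j] for j in range(n)) <= k * w[i] for i in range(n)):
            return m, w
    raise ValueError("no suitable m found up to max_m (assumed bound)")

# Hypothetical example: spectral radius sqrt(3) < k = 2.
m, w = approximate_eigenvector([[0, 3], [1, 0]], [1, Fraction(3, 5)], 2)
print(m, w)
```

Exact rational arithmetic is used so that the rounding step is not affected by floating-point error.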

Proof. (Theorem 2) We begin with a normalized representation of $s$ and compute, by Lemma 5, a $k$-approximate eigenvector whose component for the initial state is a power of $k$. We then compute, by Corollary 4, a normalized representation $(G, 1, F)$ of $s$ which admits a $k$-approximate eigenvector of component 1 for the initial state. Finally, we apply to $G$ the state splitting algorithm described in the previous section to obtain a $k$-ary graph. As the component of the approximate eigenvector on the initial state is 1 and as the state splittings have to be admissible, this state will never be split during the process. A state splitting of a state of $G$ different from state 1 leads by construction to a graph $G'$ still representing the same sequence. The result then follows from the fact that the final normalized representation has a $k$-ary graph.

We can apply the construction given above to the case of Kraft equality when it is possible to find a representation of $s$ which admits a $k$-eigenvector with a power of $k$ as component on the initial state. This may perhaps not always be
the  case.  We  do  not  know,  for  example,  if  the  series  5(2:)  =  +  {i-5z^ 

(communicated  to  us  by  Christophe  Reutenauer)  has  such  a  representation  for 
$k = 2$.

As a consequence of the previous result, we get the following proposition in the equality case, where an ultimately $k$-ary tree is a tree where all nodes but a finite number have at most $k$ sons.

Proposition 6. Let $s = (s_n)_{n \ge 1}$ be an $\mathbb{N}$-rational sequence of nonnegative integers and let $k$ be an integer such that $\sum_{n \ge 1} s_n k^{-n} = 1$. Then there is an ultimately $k$-ary rational tree such that $s$ is the enumerative sequence of its leaves.

Proof. If we remove one term of the sequence, the remainder satisfies Kraft's strict inequality and is still $\mathbb{N}$-rational. This proves that one can construct a rational tree $T$ for $s$ which will be $k$-ary for all nodes except the root, which will have $k + 1$ sons.

We now state another result for the equality case which is weaker than the previous theorem. We show that if $s$ is an $\mathbb{N}$-rational sequence of nonnegative integers satisfying Kraft's equality for an integer $k$, then there is an integer $m$ such that $ms$ is the sum of $m$ enumerative sequences of leaves of $m$ $k$-ary rational trees.
Theorem 7. Let $s$ be an $\mathbb{N}$-rational sequence satisfying $\sum_{n \ge 1} s_n k^{-n} = 1$. There is a positive integer $m$ and $m$ $k$-ary rational trees $T_1, \ldots, T_m$ such that $ms = l(T_1) + \cdots + l(T_m)$.

Proof. We begin with a normalized representation of $s$: $(G, 1, F)$. Let $v$ be a positive integral eigenvector associated to the spectral radius $k$ of the adjacency matrix of $G$. The component $v_1$ on the initial state 1 is denoted by $m$. If $m = k^p m'$, where $m'$ and $k$ are relatively prime, we compute, by Corollary 4, a normalized representation of $s$ such that $v_1 = m'$ in order to get a smaller integer $m$. After this step, $m$ and $k$ are relatively prime. If $m = 1$, we finish by the same proof as the proof of Theorem 2.

Otherwise, we denote by $r$ the positive integer such that $k^{r-1} < m < k^r$ and $x = k^r - m$. We define a new graph $H$ by adding to $G$ $(r + 1)$ new states $i_1, i_2, \ldots, i_r$ and $j$, and with the following new edges:

$$i_1 \to i_2 \to \cdots \to i_r \to 1$$

and $x$ edges (in the multigraph) going from $i_r$ to $j$:

$$i_r \to j \quad (x \text{ edges}).$$

We assign to state $i_l$ the value $k^{l-1}$ and to state $j$ the value 1 (state 1 has value $m$). For each state $t$ in $F$ we make the following transformation. We replace $t$ by $m$ copies of this state and we duplicate each edge entering $t$ in $m$ edges entering the $m$ copies of $t$. We give to all copies of $t$ the value 1. We denote by $g$ the new sequence which admits as normalized representation $(H, i_1, F \cup \{j\})$. Note that the values of states of $H$ now correspond to an integer $k$-eigenvector of $H$ which admits 1 as component on the initial state $i_1$. Using the series notations, one can verify that

$$g(z) = \sum_{n \ge 1} g_n z^n = x z^r + m z^r s(z),$$

and by construction, $g(1/k) = 1$.
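
The arithmetic behind this identity can be checked with a few lines of Python. This is a minimal sketch, not part of the paper: the helper names `padding` and `g_at_inverse_k` are hypothetical, and the only facts used are the definition of r and x above and the Kraft equality s(1/k) = 1.

```python
from fractions import Fraction

def padding(k, m):
    """Return (r, x) where r is the least integer with m <= k**r and x = k**r - m."""
    r = 1
    while k ** r < m:
        r += 1
    return r, k ** r - m

def g_at_inverse_k(k, m, s_at_inverse_k=Fraction(1)):
    """Evaluate g(1/k) = x*k**(-r) + m*k**(-r)*s(1/k), with s(1/k) = 1 by default."""
    r, x = padding(k, m)
    return Fraction(x, k ** r) + Fraction(m, k ** r) * s_at_inverse_k

# With k = 2 and m = 3: r = 2, x = 1, and g(1/2) = 1/4 + 3/4.
print(padding(2, 3), g_at_inverse_k(2, 3))
```

Whatever the values of k and m, the two terms add up to (x + m)/k^r, which equals 1 by the choice of x.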

Since the representation of $g$ has an eigenvector of component 1 at the initial state, $g$ is the height distribution of leaves of a $k$-ary rational tree (by applying the construction of Theorem 1 in the case where the value of the initial state is 1). The series $g$ is either equal to 1 or to $zg_1 + zg_2 + \cdots + zg_k$, where the $g_i$ are again series of this type ($g_i$ is the height distribution of the leaves of the subtree rooted by a son of the root of the tree representing $g$). By iterating this decomposition for each $g_i$, we can write

$$g(z) = z^r \bigl(f_1(z) + f_2(z) + \cdots + f_{k^r}(z)\bigr) = x z^r + m z^r s(z),$$

where the $f_i$ are height distributions of leaves of $k$-ary rational trees, which we simplify into:

$$f_1 + f_2 + \cdots + f_{k^r} = x + ms.$$

As all $f_i$ have nonnegative integer coefficients and satisfy $f_i(1/k) = 1$, this implies that $x$ series among $f_1, f_2, \ldots, f_{k^r}$ are equal to 1. The $m$ remaining series, which we renumber $f_1, f_2, \ldots, f_m$, verify the equality $f_1 + f_2 + \cdots + f_m = ms$, which is the announced result.




Example  Let  s  be  the  following  series: 

-  1  _  ,-2  +  1  _  2^3  ■ 

We get that $3s = f_A + f_B + f_C$, where $f_X$ is the height distribution of the leaves of the tree rooted by the node $X$ in the last picture.


Fig. 12. The sequence $s$
Fig. 13. The sequence $g$
Fig. 14. The sequence $g$ after state splittings
Abstract. We show that any rational code with bounded synchronization delay is included in a rational maximal code with bounded synchronization delay.


1  Introduction 

The theory of codes originated from Shannon's work on information theory. It is now a well-developed branch of theoretical computer science. We refer to [7, 31] for a systematic exposition of the topic, to [6, 21] for applications of codes in symbolic dynamics and coding for constrained channels, and to [17] for a survey on codes used in the context of information transmission systems.

Many beautiful properties provide a good understanding of the structure of codes. Nevertheless, several problems on codes remain unsolved despite the efforts of researchers [8, 9]. In this paper, we are interested in the following problem: given a code $X$ with some property $\mathcal{P}$, find (if it exists) an effective procedure to embed $X$ into a maximal code $Y$ with the same property $\mathcal{P}$. Effective embedding procedures exist for rational codes [14], rational codes with bounded deciphering delay [10, 3], and rational biprefix codes [23, 33]. The case of finite codes is particular: there exist finite codes included in no finite maximal code [27, 19]. One of the main open problems on codes is whether the inclusion of a finite code in a finite maximal code is decidable.

We show here that any rational code with bounded synchronization delay is effectively embeddable into a rational maximal code, again with bounded synchronization delay. Codes with bounded synchronization delay [16] are part of the family of circular codes, i.e., codes defining a unique factorization of words written on a circle [20] or of biinfinite words [12]. Circular codes and codes with bounded synchronization delay have numerous interesting properties. For instance, sequences of integers which are the length distribution of a circular code are completely characterized [30, 28, 4]; codes appearing in factorizations of free monoids are necessarily circular [29] (see also [32, 18, 13] for the description of circular codes used in finite factorizations); codes with bounded synchronization delay satisfy the commutative equivalence conjecture [24]; encoding digital data for transmission through constrained channels involves circular codes [15, 1, 5]; recently a set of codons constituting a circular code has been identified in the study of the distribution of trinucleotides in protein coding genes [2].
2 Codes with bounded synchronization delay

For  the  notions  given  in  this  section  and  the  next  one,  we  refer  to  [7]  and  [6]. 

Given a finite alphabet $A$, a code $X \subset A^*$ is a set of words such that for all $x_1, \ldots, x_n, y_1, \ldots, y_m \in X$,

$$x_1 \cdots x_n = y_1 \cdots y_m \;\Rightarrow\; n = m,\ x_i = y_i \ \forall i.$$

This definition means that any coded message $x_1 \cdots x_n$ is uniquely decoded into the code-words $x_1, \ldots, x_n$.
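
For a finite set of words, this unique-decoding property can be tested mechanically. The sketch below uses the classical Sardinas-Patterson test, which is not part of this paper and is named here only as a convenient standard technique; the function names `left_quotient` and `is_code` are hypothetical.

```python
def left_quotient(A, B):
    """Return {w : a + w is in B for some a in A}."""
    return {b[len(a):] for a in A for b in B if b.startswith(a)}

def is_code(words):
    """Sardinas-Patterson test for a finite set of words.

    Dangling suffixes are explored until no new one appears; the set is a
    code if and only if no dangling suffix is itself a code word.
    """
    C = set(words)
    if "" in C:
        return False                      # the empty word never belongs to a code
    current = left_quotient(C, C) - {""}  # first dangling suffixes
    seen = set()
    while current:
        if current & C:                   # a dangling suffix is a code word
            return False
        seen |= current
        current = (left_quotient(C, current) | left_quotient(current, C)) - seen - {""}
    return True

print(is_code(["b", "ab", "aab"]))   # a finite part of a*b
print(is_code(["a", "ab", "ba"]))    # a.ba = ab.a, two factorizations
```

The search terminates because every dangling suffix is a suffix of a code word, so only finitely many can occur.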

We are interested here in codes with bounded synchronization delay. Such codes make it easy to locate a position in a coded message through which the decoding must pass, and thus to decode the two parts separately. Formally, a code has a bounded synchronization delay if there exists $\sigma \ge 0$ such that (see Figure 1)

$$uxyv \in X^*, \text{ with } x, y \in X^\sigma \;\Rightarrow\; ux, yv \in X^*. \qquad (1)$$

The smallest integer $\sigma$ satisfying (1) is called the synchronization delay of the code $X$.


Fig. 1. Synchronization delay $\sigma$.
Fig. 2. Synchronization delay $\sigma$ when counting with letters.


Example 1. The code $X = a^*b$ has synchronization delay 1, since the letter $b$ only occurs at the end of the code-words. On the contrary, the code $X = ab^*c \cup b$ has no bounded synchronization delay, because $b^{2\sigma}$ is factor of the word $ab^{2\sigma}c$ of $X$, for any $\sigma$.
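
Condition (1) can be explored by brute force on finite parts of such codes. The following Python sketch is an illustration only: the names `in_star`, `powers` and `find_violation` are hypothetical, the infinite codes are truncated to finitely many words, and the search is bounded by `max_len`, so it can exhibit a violation of (1) but a negative answer only covers the words examined.

```python
from itertools import product

def in_star(w, X):
    """Decide whether w is a product of words of X (dynamic programming)."""
    ok = [True] + [False] * len(w)
    for i in range(1, len(w) + 1):
        ok[i] = any(ok[i - len(x)] and w[i - len(x):i] == x
                    for x in X if len(x) <= i)
    return ok[len(w)]

def powers(X, sigma):
    """Return the set X^sigma of products of exactly sigma words of X."""
    P = {""}
    for _ in range(sigma):
        P = {p + x for p in P for x in X}
    return P

def find_violation(X, alphabet, sigma, max_len):
    """Look for w = (ux)(yv) in X* with x, y in X^sigma but ux or yv not in X*."""
    Xs = powers(X, sigma)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if not in_star(w, X):
                continue
            for p in range(n + 1):
                left, right = w[:p], w[p:]
                if (any(left.endswith(x) for x in Xs)
                        and any(right.startswith(y) for y in Xs)
                        and not (in_star(left, X) and in_star(right, X))):
                    return left, right
    return None

print(find_violation(["b", "ab", "aab"], "ab", 1, 6))          # finite part of a*b
print(find_violation(["b", "ac", "abc", "abbc"], "abc", 1, 5)) # finite part of ab*c U b
```

For the second set, the word abbc splits as ab followed by bc around two occurrences of the code-word b, and neither part is a coded message, which is the phenomenon described in Example 1.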

There is another way to define the synchronization delay of codes. Instead of counting with words as done in Definition (1), one can count with letters as follows. We denote by $P(X^*)$ the set of the prefixes of the words of $X^*$, and by $P_\sigma(X^*)$ the set $P(X^*) \cap A^\sigma$. For suffixes we use the notations $S(X^*)$ and $S_\sigma(X^*)$. A code $X \subset A^*$ has a bounded letter-synchronization delay if there exists $\sigma \ge 0$ such that (see Figure 2)

$$uxyv \in X^*, \text{ with } x \in S_\sigma(X^*),\ y \in P_\sigma(X^*) \;\Rightarrow\; ux, yv \in X^*. \qquad (2)$$

The smallest integer $\sigma$ satisfying (2) is called the letter-synchronization delay of the code $X$.

For  finite  codes,  both  synchronization  delays  (on  words  or  on  letters)  are 
bounded  simultaneously.  This  is  no  longer  true  for  infinite  codes. 
Example 2. Let $A = \{a, b, c\}$ and $X \subset A^*$ equal to $a^*b \cup ca^*ba^*b$. The code $X$ has synchronization delay 2 if counting with words, but no bounded synchronization delay if counting with letters.

The family of codes of bounded synchronization delay or bounded letter-synchronization delay is included in the family of circular codes. We recall that a code is circular if

$$u x_2 \cdots x_n v = y_1 \cdots y_m,\ x_1 = vu \;\Rightarrow\; n = m,\ v = 1,\ x_i = y_i \ \forall i.$$

In the case of finite codes $X \subset A^*$, the concepts of circular code, code with bounded synchronization delay, and code with bounded letter-synchronization delay coincide. Moreover, these code properties are equivalent to:

$$X^* = P \cup \bigl((UA^* \cap A^*V) \setminus A^*WA^*\bigr)$$

with $P$, $U$, $V$, $W$ finite subsets of $A^*$.

However, for rational codes (that is, codes recognized by a finite automaton), there exist circular codes with an infinite synchronization delay, such as the code $X = ab^*c \cup b$ mentioned in Example 1. As a matter of fact, a rational circular code has a bounded synchronization delay if and only if $\exists p,\ X \cap A^* X^p A^* = \emptyset$.

Remark. Any code $X \subset A^*$ with synchronization delay 0 (on words or on letters) is necessarily included in the alphabet $A$. From now on, we suppose that $\sigma \ge 1$ in order to discard such trivial codes.

3  Completion’s  problem 

In  this  paper,  we  solve  Problem  8  of  [9]  about  the  completion  of  codes  with 
bounded  synchronization  delay. 

Recall that complete codes $X \subset A^*$ are codes such that any word over $A$ is factor of a coded message:

$$\forall w \in A^*,\ A^* w A^* \cap X^* \ne \emptyset.$$

It  is  well-known  [7]  that  for  rational  codes,  this  combinatorial  property  is  equiv¬ 
alent  to  the  extremal  property  of  being  a  maximal  code  (with  respect  to  the 
inclusion). 
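
The combinatorial side of completeness, being a factor of some coded message, can be probed by a bounded search. The sketch below is only illustrative: the function name `occurs_in_some_message` and the bound `max_words` are assumptions, and a negative answer is conclusive only when, as in the example, a separate argument rules out longer messages.

```python
from itertools import product

def occurs_in_some_message(w, X, max_words):
    """Return True if w is a factor of a product of at most max_words words of X."""
    for n in range(1, max_words + 1):
        for xs in product(X, repeat=n):
            if w in "".join(xs):
                return True
    return False

X = ["b", "ab", "aab"]                       # a finite part of a*b
print(occurs_in_some_message("aba", X, 4))   # aba is a factor of ab.ab
print(occurs_in_some_message("aaa", X, 4))   # no message over X contains aaa
```

Here no coded message over the truncated set can contain aaa, because every code-word has at most two letters a followed by b; the truncated set is therefore not complete, whereas the full code a*b is.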

We prove here that codes with bounded synchronization delay (on words or on letters) can be embedded into a complete one. The case of bounded letter-synchronization delay is solved separately, since the two notions of delay differ for infinite codes (see Example 2).

Theorem 1. Let $X \subset A^*$ be a code with synchronization delay $\sigma$. Then $X$ can be embedded into a complete code $Y \subset A^*$ with synchronization delay $\sigma' \le 2\sigma$. Moreover if $X$ is rational, then $Y$ is also rational.

Theorem 2. Let $X \subset A^*$ be a code with letter-synchronization delay $\sigma$. Then $X$ can be embedded into a complete code $Y \subset A^*$ with letter-synchronization delay $\sigma' \le 3\sigma - 2$. Moreover if $X$ is rational, then $Y$ is also rational.

The method given in [14] for embedding a rational code $X$ into a rational maximal code $Y$ also works for rational circular codes [11, 4]. Hence any rational circular code is included in a maximal one. However, this method does not preserve the bounded synchronization delay from $X$ to $Y$.

The proofs of Theorems 1 and 2 are given below in the next two sections. They are based on the following propositions which state a simple combinatorial property of complete codes with a bounded synchronization (resp. letter-synchronization) delay.

Proposition 3. Let $X \subset A^*$ be a code. Then $X$ is a complete code with synchronization delay $\le \sigma$ if and only if $X^\sigma A^* X^\sigma \subset X^*$. □

Proposition 4. Let $X \subset A^*$ be a code. Then $X$ is a complete code with letter-synchronization delay $\le \sigma$ if and only if $P_\sigma(X^*) A^* S_\sigma(X^*) \subset X^*$. □

The next example shows that the bound $2\sigma$ of Theorem 1 is tight. The bound $3\sigma - 2$ of Theorem 2 is also tight, but the example, being more elaborate, is omitted.

Example 3. Consider the alphabet $A = \{a, b, c, d\}$ and an integer $\sigma \ge 1$. The set $X = \{a, ca^{2\sigma-1}b, ba^{2\sigma-1}d, cb^{2\sigma-1}d\}$ is a code over $A$ with synchronization delay $\sigma$. Assume that $X$ can be included in a complete code $Y \subset A^*$ with a synchronization delay $\sigma' \le 2\sigma - 1$. By Proposition 3, one has $Y^{\sigma'} A^* Y^{\sigma'} \subset Y^*$. Then

GT*,  a'^^-Ua^  eY*. 

The  word  ca?^~^ba?^~^da^'  decomposes  into  words  of  Y*  as  indicated  in 
Figure 3. As $Y$ is a code, $b$ must belong to $Y$. But $b^{\sigma'}$ is factor of the word $cb^{2\sigma-1}d$ of $Y$, in contradiction with the synchronization delay $\sigma'$ of $Y$.


Fig. 3. The bound $2\sigma$ is tight.
Fig. 4. $z$ is not factor of $x$.


4  Embedding  when  counting  with  words 

In this section, $X$ is a given code over the alphabet $A$ with a bounded synchronization delay $\sigma$. The embedding of $X$ into the code $Y$ mentioned in Theorem 1 is done in two steps: construct a monoid $M$ containing $X^*$, then take its base

$$Y = \mathrm{Base}(M) = (M \setminus 1) \setminus (M \setminus 1)^2. \qquad (3)$$
In addition to $X^*$, the monoid $M$ contains all the words beginning and ending with markers $z \in X^{2\sigma}$. The use of markers already appears in the method of [14]: only one marker is used, given by an unbordered word which is factor of no word of $X^*$.

The following simple lemmas, together with Proposition 3, lead to a proof of Theorem 1: Lemmas 5 and 6 show that $Y$ is a code containing $X$. This code is proved to be complete with synchronization delay $\sigma' \le 2\sigma$ thanks to Lemma 7 and Proposition 3. Clearly if $X$ is rational, so is $Y$.

Lemma  5.  Y  is  a  code. 

Proof. To prove that $Y$ is a code, we show that the monoid $M$ is stable (see [7]): if $u, wv, uw, v \in M$, then $w \in M$. Assume that this is not the case, and consider a word $uwv$ of minimal length such that

$$u, wv, uw, v \in M \text{ but } w \notin M. \qquad (4)$$

We begin with three claims concerning $u$ and $wv$. Symmetrically they hold for $v$ and $uw$.

Claim 1. If $u = rz$ with $z \in X^{2\sigma}$ and $uw = sx$ with $x \in X^*$, then $z$ is not factor of $x$ (see Figure 4).

If $z$ is factor of $x$, then due to the synchronization delay $\sigma$ of $X$, we get $z = z_1z_2$, $x = x_1x_2$ with $z_1, z_2 \in X^\sigma$, $x_1, x_2 \in X^*$ and $rz_1 = sx_1$, $z_2wv = x_2v$. The second equality links shorter words satisfying a relation similar to (4). This is impossible.

Claim 2. If $wv = zr$ with $z \in X^{2\sigma}$ and $uw = sx$, $v = x's'$ with $x, x' \in X^*$, then $z$ is not factor of $xx'$ (see Figure 5).

Fig. 5. $z$ is not factor of $xx'$.
Fig. 6. $w$ is a proper prefix of $z$.

Assume that $z$ is factor of $xx'$ and let $z = z_1z_2$ with $z_1, z_2 \in X^\sigma$. If $w$ is prefix of $z_1$, we get the same contradiction as in Claim 1. So $z_1$ is prefix of $w$. By the synchronization delay $\sigma$ of $X$, we have $x = x_1x_2$ with $x_1, x_2 \in X^*$ and $uz_1 = sx_1$. It follows that $w = z_1x_2$ belongs to $X^*$, a contradiction with (4).

Claim 3. If $wv = zr$ with $z \in X^{2\sigma}$, then $w$ is a proper prefix of $z$ (see Figure 6).

Assume the contrary, i.e., $z$ is prefix of $w$. By Claim 2, $uw$ belongs to $M \setminus X^*$. Moreover any suffix $z' \in X^{2\sigma}$ of $uw$ is a proper suffix of $w$, again by Claim 2. It follows that $w \in X^{2\sigma}A^* \cap A^*X^{2\sigma} \subset M$, a contradiction with (4).

We now end the proof. In (4), at least one of the words $u$, $wv$, $uw$ and $v$ is in $M \setminus X^*$ since $X$ is a code.

Assume that $u \in M \setminus X^*$ and let $u = rz$ with $z \in X^{2\sigma}$. It follows by Claim 1 that $uw \in M \setminus X^*$. Let $uw = r'z'$ with $z' \in X^{2\sigma}$. Again by Claim 1, we get $|r| < |r'|$. Consider now the word $wv$. It must belong to $M \setminus X^*$ by Claim 2 applied to $uw$. Let $wv = z''r''$ with $z'' \in X^{2\sigma}$. Then $|z''| > |w|$ by Claim 3. But this is in contradiction with Claim 2 applied to $uw$.

Assume that $u \in X^*$ and $wv \in M \setminus X^*$. Then $wv = zr$ with $z \in X^{2\sigma}$ and $|z| > |w|$ by Claim 3. Claim 2 applied to $uw$ shows that $uw \in X^*$. It follows that $v \in M \setminus X^*$, again by Claim 2 applied to $wv$. Let $v = z'r'$ with $z' \in X^{2\sigma}$. Claim 1 applied to $v$ leads to $z$ being factor of $uwz' \in X^*$. This is in contradiction with Claim 2.

The other cases are symmetrical. Therefore, assumption (4) is false and $w \in M$, showing that $Y$ is a code. □

Lemma 6. $X \subset Y$.

Proof. Assume the contrary, that is, some word $x \in X$ factorizes as $y_1 \cdots y_n$ with $n \ge 2$ and $y_1, \ldots, y_n \in Y$. At least one of these words, say $y_i$, belongs to $Y \setminus X$ since $X$ is a code. As $y_i \in X^{2\sigma}A^* \cap A^*X^{2\sigma}$ and $y_i$ is factor of $x$, this leads to a contradiction with the synchronization delay $\sigma$ of $X$. □

Lemma 7. $Y^{2\sigma}A^*Y^{2\sigma} \subset Y^*$.

Proof. By (3), we have $Y^{2\sigma}A^*Y^{2\sigma} \subset X^{2\sigma}A^*X^{2\sigma} \subset M = Y^*$. □

5  Embedding  when  counting  with  letters 

In this section, $X \subset A^*$ is a code with letter-synchronization delay $\sigma$. We show how to construct the complete code $Y$ of Theorem 2 and we prove the correctness of the construction. We denote by $\tau$ the constant $3\sigma - 2$.

The algorithm uses a particular operation $Z(M)$ defined by (see Figure 7)

$$Z(M) = \{w \in A^* \setminus X^* \mid w = zu = u'z', \text{ with } z \in P_\tau(M),\ z' \in S_\tau(M)\}$$
$$\cup\ \{w \in A^* \setminus X^* \mid \text{there exist } u \in S(M),\ u' \in P(M) \text{ with } z = wu' \in P_\tau(M),\ z' = uw \in S_\tau(M)\}.$$

Notice that $Z(M) \cap X^* = \emptyset$ and that $Z(M)$ is the union of two sets, one with words of length greater than or equal to $\tau$, the other with words of length less than or equal to $\tau$. As in Section 4, the operation $Z$ uses markers $z$ in $P_\tau(M)$ or $S_\tau(M)$ (instead of $X^{2\sigma}$).

Fig. 7. Operation $Z$.


The  algorithm  works  as  follows  : 
M = X*
Repeat
    M' = M
    M = (Z(M) ∪ M)*
until M = M'
Y = Base(M)

The  proof  of  Theorem  2  is  done  in  a  similar  way  as  for  Theorem  1.  We  begin 
with  a  technical  lemma. 

Lemma 8. $M \setminus X^* \subset P_\sigma(X^*)A^* \cap A^*S_\sigma(X^*)$.

Proof. We are going to prove the next four statements. Lemma 8 is a corollary of (4) since $2\sigma - 1 \ge \sigma$.

1. Let $w \in Z(M)$ with length $|w| < \tau$ and $u \in S(M)$, $u' \in P(M)$ such that $z = wu' \in P_\tau(M)$ and $z' = uw \in S_\tau(M)$. Then $uwu'$ has no factor in $S_\sigma(X^*)P_\sigma(X^*)$.

2. Any $w \in Z(M)$ has length at least equal to $2\sigma - 1$.

3. For any $w \in Z(M)$, let $z = w$ if $|w| < \tau$, let $z$ be the prefix of length $\tau$ of $w$ otherwise. Then either $z \in P(X^*)$ or $z$ has a proper prefix in $X^*(Z(M) \cap P(X^*))$.
Symmetrically, let $z' = w$ if $|w| < \tau$, let $z'$ be the suffix of length $\tau$ of $w$ otherwise. Then either $z' \in S(X^*)$ or $z'$ has a proper suffix in $(Z(M) \cap S(X^*))X^*$.

4. For any $w \in M \setminus X^*$, $w$ has a prefix (resp. suffix) with length $2\sigma - 1$ in $P(X^*)$ (resp. $S(X^*)$).

The four statements are proved by induction on the passes through the repeat instruction of the previous algorithm. We denote by $M_i$ the value of $M$ at the beginning of the repeat instruction at pass $i$. Initially $M_1 = X^*$. Notice that $M_i \subset M_{i+1}$ $\forall i$.

• Pass 1. At this stage, consider $Z(M_1) = Z(X^*)$.

(1) As $X$ has letter-synchronization delay $\sigma$ and $\tau \ge \sigma$, we have $|u|, |u'| < \sigma$, otherwise $w \in X^*$. Assume that $uwu'$ has a factor in $S_\sigma(X^*)P_\sigma(X^*)$, i.e.,

$$uwu' = rx_1x_2r' \text{ with } x_1 \in S_\sigma(X^*),\ x_2 \in P_\sigma(X^*).$$

Then $|u| < |rx_1|$, $|u'| < |x_2r'|$. Let $w = w_1w_2$ such that $uw_1 = rx_1$, $w_2u' = x_2r'$. By the letter-synchronization delay $\sigma$ of $X$, we get $w_1, w_2 \in X^*$, a contradiction with $w \notin X^*$.

(2) The statement holds for words $w$ of $Z(X^*)$ with length $|w| \ge \tau$. For the other words we use the notations of (1). We already know that $|u|, |u'| < \sigma$. As $|uw| = |wu'| = \tau$, it follows that $|w| \ge 2\sigma - 1$.

(3) Clearly $z \in P(X^*)$ (resp. $z' \in S(X^*)$).

(4) As a consequence of (2) and (3), any word of $(Z(M_1) \cup M_1)^* \setminus X^* = M_2 \setminus X^*$ has a prefix in $P_{2\sigma-1}(X^*)$ and a suffix in $S_{2\sigma-1}(X^*)$.
• Pass $i$, with $i > 1$. We suppose that $Z(M_{i-1})$ satisfies (1)-(3) and $(Z(M_{i-1}) \cup M_{i-1})^* = M_i$ satisfies (4). Let us consider $Z(M_i)$.

(1) Let $u \in S(M_i)$, $u' \in P(M_i)$ such that $z = wu' \in P_\tau(M_i)$ and $z' = uw \in S_\tau(M_i)$.

Assume that $uwu'$ has a factor in $S_\sigma(X^*)P_\sigma(X^*)$, i.e.,

$$uwu' = rx_1x_2r' \text{ with } x_1 \in S_\sigma(X^*),\ x_2 \in P_\sigma(X^*).$$

To get a contradiction, the idea is the following. We first suppose that $|u| < |rx_1|$ and $|u'| < |x_2r'|$. Let $w = w_1w_2$ such that $uw_1 = rx_1$ and $w_2u' = x_2r'$. We will prove that

$$w_1, w_2 \in X^*$$

showing that $w \in X^*$, which is impossible. If $|u| \ge |rx_1|$ or $|u'| \ge |x_2r'|$, the contradiction is obtained in the same way. Indeed, suppose that $|u| \ge |rx_1| \ge \sigma$. Let $x'_1$ (resp. $x'_2$) be the suffix of $u$ (resp. prefix of $z$) with length $\sigma$. By induction hypothesis (4), $x'_1 \in S_\sigma(X^*)$, $x'_2 \in P_\sigma(X^*)$. We then replace $x_1x_2$ by $x'_1x'_2$ and we repeat the situation just described, showing that $w \in X^*$.

So, consider that $|u| < |rx_1|$ and $|u'| < |x_2r'|$. Let us show that $w_1 \in X^*$ (a symmetrical argument shows that $w_2 \in X^*$). Since $|x_2| = \sigma$ and $|z| = \tau$, we have

$$|w_1| < 2\sigma - 1.$$

Let $u'' = u$ if $|u| < \sigma$, let $u''$ be the suffix of $u$ with length $\sigma$ otherwise. Then

$$u'' \in S(X^*)$$

by induction hypothesis (4). This situation is summarized in Figure 8.


Fig. 8. Case $|u| < |rx_1|$, $|u'| < |x_2r'|$.

If $z \in P(X^*) = P(M_1)$, then $x_1x_2$ is factor of $u''z \in S(X^*)P(X^*)$. By the letter-synchronization delay $\sigma$ of $X$, it follows that $w_1 \in X^*$.

If $z \notin P(X^*)$, let $z = xw's$ such that $x \in X^*$, $w' \in Z(M_{i-1})$ and $s \in P(M_i)$. By induction hypothesis (1), $w'$ has no factor in $S_\sigma(X^*)P_\sigma(X^*)$. Hence either $|r| < |ux|$ or $|uxw'| < |rx_1x_2|$.

Consider the first case. We know that $w'$ has a prefix $p$ of length $2\sigma - 1$ in $P(X^*)$ by induction hypothesis (4). Therefore, we conclude as just before because $x_1x_2$ (of length $2\sigma$) is factor of the word $u''xp \in S(X^*)P(X^*)$.

Suppose now that

$$|ux| \le |r| \text{ and } |uxw'| < |rx_1x_2|$$

and let us show that this case cannot occur (see Figure 9). By induction hypothesis (2), we have $|w'| \ge 2\sigma - 1$ and then $|s| < \sigma$. Thus $|rx_1| < |uxw'|$ because $|w_1| < 2\sigma - 1$.

Fig. 9. Case $|ux| \le |r|$, $|uxw'| < |rx_1x_2|$.
Fig. 10. Construction of $w'$.


By induction hypothesis (4), $w' = ty$ with $y \in S_{2\sigma-1}(X^*)$. Similarly, $s \in P(X^*)$ since $|s| < \sigma$. Therefore $x_1x_2$ is factor of $ys \in S(X^*)P(X^*)$. By the letter-synchronization delay $\sigma$ of $X$, it follows that $y = y_1y_2$ with $uxty_1 = rx_1$, $y_2s = x_2r'$ and $y_2 \in X^*$.

As $w' \in Z(M_{i-1})$ and $|w'| < \tau$,

$$vw' \in S_\tau(M_{i-1}) \text{ and } w'v' \in P_\tau(M_{i-1})$$

for some $v \in S(M_{i-1})$, $v' \in P(M_{i-1})$ (see Figure 10).

We have $\sigma \le |ty_1| < 2\sigma - 1$ because $x_1$ is factor of $y_1$ and $|w_1| < 2\sigma - 1$. Hence $\sigma \le |y_2v'| < 2\sigma - 1$ since $|y_2v'| = |w'v'| - |ty_1|$, and $v' \in P(X^*)$ by induction hypothesis (4). It follows that $ty_1$ has a suffix $x_1 \in S_\sigma(X^*)$ and $y_2v'$ has a prefix in $P_\sigma(X^*)$. This is impossible with respect to induction hypothesis (1) applied to $vw'v'$.

This concludes the proof.

(2) We only have to give the proof for words $w$ of $Z(M_i)$ with length $|w| < \tau$. Let $u \in S(M_i)$, $u' \in P(M_i)$ such that $z = wu' \in P_\tau(M_i)$ and $z' = uw \in S_\tau(M_i)$. Assume that $|w| < 2\sigma - 1$, then $|u|, |u'| \ge \sigma$. By induction hypothesis (4), $z'$ has a suffix in $S_\sigma(X^*)$ and $u'$ has a prefix in $P_\sigma(X^*)$. This is impossible by (1).

(3) Either $z \in P(M_1) = P(X^*)$ or $z = xw's$ with $x \in X^*$, $w' \in Z(M_{i-1})$ and $s \in P(M_i)$. By induction hypothesis (3), $w'$ is either in $P(X^*)$ or has a proper prefix in $X^*(Z(M_{i-2}) \cap P(X^*))$.

(4) Consequence of (2) and (3). □

Lemma  9.  Y  is  a  code. 

Proof. We show that the monoid $M$ constructed by the algorithm is stable. Let $u, wv, uw, v \in M$. If $|wv|, |uw| \ge \tau$, then $w \in Z(M) \subset M$. Otherwise, we obtain the same conclusion with the word $w'uwvw'$ such that $w' \in M$ has length $|w'| \ge \tau$. □


Lemma 10. $X \subset Y$.

Proof. By construction, $X \subset M$. Assume that some $x \in X$ belongs to $Y^+$, i.e.,

$$x = y_1 \cdots y_n \text{ with } y_1, \ldots, y_n \in Y \text{ and } n \ge 2.$$

At least one of these words, say $y_i$, is in $Y \setminus X$ since $X$ is a code.

Suppose that $i \ne 1$. By Lemma 8, $y_i$ has a prefix in $P_\sigma(X^*)$. Take $y \in X^*$ such that $|yy_1 \cdots y_{i-1}| \ge \sigma$. Either $yy_1 \cdots y_{i-1} \in X^*$ or $yy_1 \cdots y_{i-1} \in X^*(Y \setminus X)Y^*$. In both cases, this word has a suffix in $S_\sigma(X^*)$ (Lemma 8). Then the word $yx$ of $X^*$ has a factor in $S_\sigma(X^*)P_\sigma(X^*)$. Due to the letter-synchronization delay $\sigma$ of $X$, it follows that $x \in X^+$. This is impossible.

The case $i = 1$ is solved in a similar way, by working with $y_{i+1} \cdots y_n$ instead of $y_1 \cdots y_{i-1}$. □

Lemma 11. $P_\tau(Y^*)A^*S_\tau(Y^*) \subset Y^*$.

Proof. Immediate since $P_\tau(Y^*)A^*S_\tau(Y^*) \subset Z(M) \subset M = Y^*$. □

Lemmas 9, 10 and 11 together with Proposition 4 show that $Y$ is a complete code with letter-synchronization delay $\le \tau$. The property that $Y$ is rational if $X$ is rational is proved below. Consequently, Theorem 2 is proved.

Lemma  12.  If  X  is  rational,  then  Y  is  rational. 

Proof. It is enough to show that $Z(M)$ is rational, and that the execution of the
algorithm needs a finite number of passes through the repeat instruction.

The set $Z(M)$ is composed of two subsets. The first one equals $P_r(M)A^* \cap
A^*S_r(M) \setminus X^*$, which is rational since $P_r(M)$ and $S_r(M)$ are finite. The second
one is composed of some words of length less than or equal to $r$. It is therefore
rational.

Inside the repeat instruction, we have $M' \ne M$ if the operation $Z$ gives new
words of length less than $r$. Such words are finite in number, showing that the
repeat instruction is executed finitely many times. □

Abstract. Classically, several properties and relations of words, such
as "being a power of a same word", can be expressed by using word
equations. This paper is devoted to a general study of the expressive power
of word equations. As main results we prove theorems which allow us to
show that certain properties of words are not expressible as components
of solutions of word equations. In particular, "the primitiveness" and
"the equal length" are such properties, as well as being "any word over
a proper subalphabet".


1  Introduction 

Several authors in the existing literature, cf. [16], have used word equations in
order to describe properties and relations of words but, to our knowledge, no
attempt at a synthesis or a systematization of this topic has been made. This was
also emphasized in a recent survey [6], where some results of the field were collected.

Classical relations on words that are characterized as solution sets of word
equations are, for instance, "two words X and Y are powers of a same word"
if and only if they constitute a solution of the equation $XY = YX$, and "two
words X and Y are conjugates" if and only if they constitute a solution of the
equation $XZ = ZY$. In the first case we need no extra variables, while in the
second case an additional variable seems to be needed. As above, we identify
names of variables and particular solutions of an equation.
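
The two classical characterizations above can be checked exhaustively on short words. The following is only a sanity check under stated assumptions, not a proof: the alphabet {a, b}, the length bound 4 and all function names are choices made for this illustration.

```python
from itertools import product


def words(alphabet, max_len):
    """Yield every word over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def primitive_root(w):
    """Return the shortest r such that w = r^k (w must be nonempty)."""
    n = len(w)
    for d in range(1, n + 1):
        if n % d == 0 and w[:d] * (n // d) == w:
            return w[:d]


def powers_of_same_word(x, y):
    """True iff x and y are both powers of one common word."""
    if not x or not y:
        return True  # the empty word is the 0-th power of any word
    return primitive_root(x) == primitive_root(y)


def conjugates(x, y):
    """True iff x = pq and y = qp for some words p, q."""
    return len(x) == len(y) and y in x + x


ALPHABET = "ab"
ALL = list(words(ALPHABET, 4))

# XY = YX holds exactly when X and Y are powers of a same word.
commute_ok = all(
    (x + y == y + x) == powers_of_same_word(x, y) for x in ALL for y in ALL
)

# XZ = ZY has a solution Z exactly when X and Y are conjugates.
# If x = pq and y = qp, then Z = p works, so |Z| <= |x| suffices in the search.
conjugacy_ok = all(
    any(x + z == z + y for z in words(ALPHABET, len(x))) == conjugates(x, y)
    for x in ALL
    for y in ALL
)
```

Note that the second check needs the extra variable Z, in line with the remark that an additional variable seems to be needed for conjugacy.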

Motivated by the above, we say that a property of words - either a language
$L \subseteq \Sigma^*$ or a $k$-ary relation $\mathcal{R} \subseteq (\Sigma^*)^k$ - is expressible by a word equation if
there exists an equation $e$ with $t \ge k$ variables over $\Sigma$ such that

- $L$ coincides with the values of a fixed component of all solutions of $e$,

or

- $\mathcal{R}$ coincides with the values of $k$ fixed components of all solutions of $e$.

Obviously, languages are $k$-ary relations with $k = 1$ but, due to the importance
of this particular case, we have chosen to define them separately. We allow
$e$ to contain constants from $\Sigma$. An important feature here is also that $t$ can be
larger than $k$, i.e. additional variables are allowed. This essentially increases the
expressive power of equations, and in particular makes it much easier to express
certain properties by equations.

As an illustration we recall the following. The union of the solution sets of two
equations can be expressed as the solution set of one equation, as was shown in [4]
using 4 additional variables, and later improved to require only 2 additional ones
by [8], cf. also [6]. Similarly, the inequality, that is the set of $t$-tuples of words
which do not satisfy a given equation $e$ with $t$ variables, can be expressed as
a union of the solution sets of finitely many equations, each of them using 3
extra variables, cf. e.g. [6], and consequently the inequality is expressible by one
equation if additional variables are allowed.

This way of expressing relations on words using word equations is very natural
and resembles the way of expressing enumerable relations on integers by
diophantine equations. However, the expressive power of our method is weaker.
Namely, while diophantine equations can express all recursively enumerable sets
(of integers), cf. [18], word equations can express only recursive relations on
words due to Makanin's result, cf. [17]. Actually, our results show that not
even all of those can be expressed.

A  central  problem  in  the  study  of  the  expressive  power  of  word  equations  is 
to  show  that  some  relations  are  not  expressible.  A  similar  situation  -  a  need  to 
show  that  certain  languages  are  not  generated  by  a  certain  type  of  devices  -  was 
encountered  at  the  early  stages  of  the  formal  language  theory.  By  now  there  are 
a  lot  of  tools  for  the  latter  problem,  while  there  seems  to  be  none  for  the  former. 

As the main contribution of this paper we introduce such tools for word
equations. More precisely, we prove theorems resembling the pumping lemmas of
formal languages, which allow one to prove nonexpressibility. Very intuitively,
we show that if a given equation defines a certain language, or, in fact, just a
certain word of it via a variable X, then X actually contains some "unfixed parts"
which can be filled arbitrarily, and this leads outside the considered language.

The contents of this paper are summarized as follows. In the next section we
state several properties of words which are expressible by equations, including
some closure properties of expressible languages, such as the closure under
catenation, union and Kleene star of a word. Most of the material in this section can
be considered folklore, although we have at least one new proof.

Then in Section 4 we prove our main results, namely tools for showing
nonexpressibility. In Section 5 we use our theorems to show that particular
languages or relations, such as "the set of primitive words", "the language $(a \cup b)^*$
over $\{a, b, c\}$" or "the relation equal length", are not expressible. As a
consequence we conclude that expressible languages are not closed under operations
of Kleene star, complementation or shuffle. In Section 6 we compare the family
of expressible languages to a few much studied families, and finally, in Section 7
we state several open problems.

Due to limited space all proofs are omitted; a complete version of the paper
can be found in http://www.tucs.abo.fi/publications/techreports.


2  On  the  power  of  expressibility 

In this section we give, without trying to be exhaustive, several examples
of properties of words which are expressible as solutions of word equations and
some closure properties of languages and relations. All results presented here are
either very simple or have been presented before; however, some of them do not
seem to be generally known, and moreover we seem to have a simplified proof.

Let $\Sigma$ be an alphabet of constants and $\Theta$ be an alphabet of variables. We
assume that these alphabets are disjoint. We use the convention that lower case
letters represent constants and capital letters represent variables.

A word equation is a pair of words $(u, v) \in (\Sigma \cup \Theta)^* \times (\Sigma \cup \Theta)^*$, usually
denoted by $u = v$. The size of an equation is the sum of the lengths of $u$ and $v$. A
solution of a word equation $u = v$ is a morphism $h : (\Sigma \cup \Theta)^* \to \Sigma^*$ such that
$h(a) = a$, for $a \in \Sigma$, and $h(u) = h(v)$. We say that a language $L$ is expressible
if there is an equation $e$ and a variable $X$ such that

$L = \{h(X) : h \text{ is a solution of } e\}$.

Similarly, we say that a property $\mathcal{R} \subseteq (\Sigma^*)^k$ is expressible by an equation $e$ if
there are variables $X_1, \ldots, X_k$ such that

$\mathcal{R} = \{(h(X_1), \ldots, h(X_k)) : h \text{ is a solution of } e\}$.

The property of expressibility depends on the sizes of the alphabets $\Sigma$ and $\Theta$.
In this paper we concentrate on the case when the alphabet $\Sigma$ is finite. We also
assume that $|\Sigma| \ge 2$. In the case of a unary alphabet all expressible languages
are trivially regular. Denote by $\mathcal{L}(\Sigma)$ the family of expressible languages over
the alphabet $\Sigma$.
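
To make the definition of an expressible language concrete, the following minimal sketch enumerates all solutions of an equation whose components have bounded length and collects the values of one chosen variable. It is a brute-force illustration only; the single-character encoding (upper case for variables, lower case for constants), the length bound and the function names are assumptions of this sketch.

```python
from itertools import product


def image(h, side):
    """Image of one side of an equation under h; constants are mapped to themselves."""
    return "".join(h.get(symbol, symbol) for symbol in side)


def solutions(lhs, rhs, variables, alphabet, max_len):
    """Yield every solution h with |h(X)| <= max_len for each variable X."""
    pool = [
        "".join(p) for n in range(max_len + 1) for p in product(alphabet, repeat=n)
    ]
    for values in product(pool, repeat=len(variables)):
        h = dict(zip(variables, values))
        if image(h, lhs) == image(h, rhs):
            yield h


def expressed_language(lhs, rhs, variables, target, alphabet, max_len):
    """Values taken by the variable `target` over all bounded solutions."""
    return {h[target] for h in solutions(lhs, rhs, variables, alphabet, max_len)}


# The equation W = XcT over {a, b, c}: the component W ranges over the
# words that contain the letter c (restricted here to length at most 3).
found = expressed_language("W", "XcT", "WXT", "W", "abc", 3)
```

Such an enumeration can only confirm membership up to the chosen bound; it says nothing about nonexpressibility, which is the subject of the later sections.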

Example 1. The properties:

- W is not square-free, and

- those words W in $\{a, b, c\}^*$ which contain a letter c

are expressible. Indeed, the former is obtained from the equation $W = XUUY$
under the extra condition $U \ne \varepsilon$, so that, by Theorem 2, the whole property
can be encoded into one equation. The latter one is expressed by the equation
$W = XcT$.

Example 2. Every finite and co-finite language over a finite alphabet $\Sigma$ is
expressible. Indeed, for $L = \{w_1, \ldots, w_t\} \subseteq \Sigma^*$, $L$ and $\Sigma^* - L$ are expressed by
the formulae

$$\bigvee_{i=1}^{t} X = w_i$$

and

$$\Big(\bigvee_{w \ne w_i,\ |w| \le N} X = w\Big) \vee \Big(\bigvee_{|w| = N+1} X = wY\Big),$$

where $N = \max\{|w_i| : i = 1, \ldots, t\}$. As above, Theorem 2 makes it possible to
express these formulae using only one equation.

Example  3.  The  properties 

-  W  is  imprimitive,  and 

- W is not minimal in its conjugacy class with respect to the lexicographic
ordering $\prec$

are  expressible,  too.  This  follows,  as  above,  from  the  formula 
WZ  =  ZW  and  Z  =  VET  and  T  1 


and 

$W = UV$ and $W' = VU$ and $W' \prec W$

after the observation that the relation $W' \prec W$ is expressible by the formula

$$\bigvee_{a \prec b} (W' = RaT \text{ and } W = RbT').$$

After  these  examples  we  formulate  several  closure  properties  of  expressible 
languages  and  relations.  Our  first  result  is  very  easy. 

Theorem 1. The family of expressible languages is closed under the following
operations: catenation, cyclic closure, and Kleene star of a single word.

Our  second  result,  which  we  have  already  used  several  times,  deals  with  the 
closure  properties  under  Boolean  operations. 

Theorem 2. Let $e : u = v$ and $e' : u' = v'$ be two equations. Then

1. A property expressible by e and e' is expressible by a single equation without
any additional variables.

2. A property expressible by e or e' is expressible by a single equation using two
additional variables.

3. The relation satisfying $u \ne v$ is expressible by a single equation using a finite
number of additional variables.
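
Since the proofs are omitted here, the following sketch illustrates case 1 with one standard construction, which is not necessarily the one the authors have in mind: for two distinct letters a and b, the pair u = v, u' = v' holds exactly when the single equation u a u' u b u' = v a v' v b v' holds. The check below works directly on the images of the four sides; the function names and the length bound are assumptions of this illustration.

```python
from itertools import product


def words(alphabet, max_len):
    """Yield every word over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def combined_holds(u, v, u2, v2, a="a", b="b"):
    """Single equation u a u' u b u' = v a v' v b v' (u2, v2 stand for u', v')."""
    return u + a + u2 + u + b + u2 == v + a + v2 + v + b + v2


ALL = list(words("ab", 3))

# The combined equation holds exactly when both original equations hold.
conjunction_ok = all(
    combined_holds(u, v, u2, v2) == (u == v and u2 == v2)
    for u, v, u2, v2 in product(ALL, repeat=4)
)

# A single separator is not enough: u a u' = v a v' can hold although u != v.
naive_counterexample = ("" + "a" + "aa") == ("a" + "a" + "a")
```

The two copies with different separators force the two sides to be cut at the same position, which is why no additional variable is needed.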

Theorem 2 deserves a few comments. First, we have a new proof of case 2
which is a simplification of the proof presented in [6] which, in turn, was based
on ideas of S. Grigorieff [8]. Second, it clearly gives more closure properties of
expressible languages and relations, such as

Corollary  3.  Any  language  or  relation  of  words  expressible  by  a  formula  built 
on  word  equations  using  operations  of  conjunction,  disjunction  and  negation  is 
expressible  by  a  single  equation. 

Third, with case 3 one has to be careful. It says that the complement of
the relation defined by an equation $u = v$ using all variables of the equation
is expressible by a single equation (using additional variables). This, however,
does not mean that expressible languages are closed under complementation.
In fact, they are not, as we shall show in Section 5. Of course, in some special
cases, such a closure might hold.

We conclude this section by stating two more closure properties of the family
of expressible languages.

Theorem  4.  The  expressible  languages  are  closed  under 

1.  finite  intersections,  and 

2.  finite  unions. 


3  Expressibility  of  languages  by  equations  with  two 
variables 

In this section we introduce technical tools and apply them to languages
expressible using only two variables. First, given a vector $z$ of natural numbers, we
define an equivalence relation $\mathcal{R}_z$ on positions in words determined by solutions
specified by a vector $z$ of lengths of words constituting a solution of an equation.
The intuition behind the definition of $\mathcal{R}_z$ is as follows. Consider a fixed
equation $u = v$, and fix the lengths of the components of a solution by the vector
$z$. This fixes the lengths of both sides of $h(u) = h(v)$. But this is an identity in
$\Sigma^*$, so that corresponding positions on both sides must be filled with the same
letter. This induces via $\mathcal{R}_z$ equivalence classes of positions. These classes may
contain constants, i.e. pairs of the form $(1, a)$ with $a \in \Sigma$, or unfixed parts of
the variables, i.e. pairs of the form $(i, X)$ corresponding to the $i$-th letter of $X$.
Of course, in a concrete solution the second components of an equivalence class
must coincide.

Assume that an equation $e$ contains $t$ variables $X_1, X_2, \ldots, X_t$ and $z =
(z_1, \ldots, z_t)$ is a vector of $t$ natural numbers. We say that $h$ is a $z$-solution of $e$ if
$h$ is a solution of $e$ and $|h(X_j)| = z_j$, for $1 \le j \le t$. For a vector $z = (z_1, \ldots, z_t)$
we define a function $|\cdot|_z : (\Theta \cup \Sigma)^* \to \mathbb{N}$ by

$$|u|_z = \begin{cases} z_m & \text{if } u = X_m \in \Theta, \\ 1 & \text{if } u \in \Sigma, \\ \sum_{k=1}^{s} |a_k|_z & \text{if } u = a_1 a_2 \ldots a_s \text{ with } a_j \in \Theta \cup \Sigma. \end{cases}$$

In other words, $|w|_z$ is the length of the word $h(w)$ if $h$ is a $z$-solution of some
equation.

Now, assume that we are given an equation $u_1 \ldots u_k = v_1 \ldots v_s$ over $t$
variables and a vector $z \in \mathbb{N}^t$ such that $|u|_z = |v|_z$. We define a function
$\mathit{left}_z : \{1, \ldots, |u|_z\} \to \mathbb{N} \times (\Theta \cup \Sigma)$ in the following way:

$\mathit{left}_z(j) = (r, x)$ iff

$|u_1 \ldots u_p|_z < j \le |u_1 \ldots u_{p+1}|_z$ and $r = j - |u_1 \ldots u_p|_z$ and $u_{p+1} = x$.

Similarly, we define the function $\mathit{right}_z$:

$\mathit{right}_z(j) = (r, x)$ iff

$|v_1 \ldots v_p|_z < j \le |v_1 \ldots v_{p+1}|_z$ and $r = j - |v_1 \ldots v_p|_z$ and $v_{p+1} = x$.

An equivalence relation $\mathcal{R}_z$ on positions $\{1, \ldots, |u|_z\}$ is the transitive closure
of the relation $\mathcal{R}'_z$ defined by

$i \, \mathcal{R}'_z \, j$ iff $\mathit{left}_z(i) = \mathit{right}_z(j)$ or $\mathit{left}_z(i) = \mathit{left}_z(j)$ or $\mathit{right}_z(i) = \mathit{right}_z(j)$.

We say that a position $i$ belongs to a variable $X$ if either $\mathit{left}_z(i) = (j, X)$ or
$\mathit{right}_z(i) = (j, X)$, for some $j$. Let $\mathcal{X}$ be an equivalence class of the relation $\mathcal{R}_z$.
We say that $\mathcal{X}$ corresponds to a constant $a$ if there is a position $i$ in $\mathcal{X}$ such that
either $\mathit{left}_z(i) = (1, a)$ or $\mathit{right}_z(i) = (1, a)$.
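
The definitions above translate directly into a small computation. The sketch below builds the two position maps and merges positions that carry the same label on either side, which yields the equivalence generated by the relation defined above. The list-of-symbols encoding, the dictionary `z`, the union-find helper and the function names are assumptions of this sketch; the data is that of the equation $aX_1X_2bX_1 = X_3X_4X_3$ with $z = (2, 4, 5, 0)$ used in Example 4.

```python
def position_map(side, z):
    """Return [None, m(1), ..., m(n)], where m(j) = (r, x) means that
    position j is the r-th letter of the symbol x (constants have length 1)."""
    table = [None]
    for symbol in side:
        for r in range(1, z.get(symbol, 1) + 1):
            table.append((r, symbol))
    return table


def classes(lhs, rhs, z):
    """Equivalence classes of positions, each paired with the set of
    constants it corresponds to. `z` maps every variable to its length."""
    left, right = position_map(lhs, z), position_map(rhs, z)
    if len(left) != len(right):
        raise ValueError("the two sides have different lengths under z")
    n = len(left) - 1
    parent = list(range(n + 1))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Positions sharing a label (r, x), on either side, are merged.
    first_seen = {}
    for i in range(1, n + 1):
        for label in (left[i], right[i]):
            j = first_seen.setdefault(label, i)
            parent[find(i)] = find(j)

    groups = {}
    for i in range(1, n + 1):
        groups.setdefault(find(i), []).append(i)

    result = []
    for members in groups.values():
        constants = {
            x for i in members for (_, x) in (left[i], right[i]) if x not in z
        }
        result.append((members, constants))
    return sorted(result, key=lambda pair: pair[0])


EXAMPLE_4 = classes(
    ["a", "X1", "X2", "b", "X1"],
    ["X3", "X4", "X3"],
    {"X1": 2, "X2": 4, "X3": 5, "X4": 0},
)
```

A class whose set of constants is empty is an "unfixed part"; a class with two different constants signals that no solution with these lengths exists.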

Example 4. Consider an equation $e : aX_1X_2bX_1 = X_3X_4X_3$. Let $z = (2, 4, 5, 0)$.
Then the values of the functions $\mathit{left}_z$ and $\mathit{right}_z$ are listed below.

position   1          2          3          4          5
left_z     (1,a)      (1,X_1)    (2,X_1)    (1,X_2)    (2,X_2)
right_z    (1,X_3)    (2,X_3)    (3,X_3)    (4,X_3)    (5,X_3)

position   6          7          8          9          10
left_z     (3,X_2)    (4,X_2)    (1,b)      (1,X_1)    (2,X_1)
right_z    (1,X_3)    (2,X_3)    (3,X_3)    (4,X_3)    (5,X_3)

Then the equivalence classes of $\mathcal{R}_z$ are $\mathcal{X} = \{1, 6\}$, $\mathcal{Y} = \{3, 5, 8, 10\}$ and
$\mathcal{Z} = \{2, 4, 7, 9\}$. The equivalence classes $\mathcal{X}$ and $\mathcal{Y}$ correspond to the constants $a$
and $b$ since $\mathit{left}_z(1) = (1, a)$ and $\mathit{left}_z(8) = (1, b)$, respectively. The equivalence
class $\mathcal{Z}$ does not correspond to any constant. Hence, the positions in $\mathcal{Z}$ can be
filled with any letter and, by case 4 of Lemma 5, they can be replaced by any
word as well. This gives the following family of solutions of the equation $e$:

$X_1 = \beta b$, $X_2 = \beta b a \beta$, $X_3 = a \beta b \beta b$, $X_4 = \varepsilon$,

where $\beta$ can be replaced by any word.
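
The family of solutions of Example 4 can be checked mechanically. The sketch below substitutes $X_1 = \beta b$, $X_2 = \beta b a \beta$, $X_3 = a \beta b \beta b$, $X_4 = \varepsilon$ into $aX_1X_2bX_1 = X_3X_4X_3$ for every $\beta$ over $\{a, b, c\}$ of length at most 3; the alphabet, the bound and the function names are assumptions of this illustration.

```python
from itertools import product


def is_solution(x1, x2, x3, x4):
    """Check the equation a X1 X2 b X1 = X3 X4 X3 for concrete words."""
    return "a" + x1 + x2 + "b" + x1 == x3 + x4 + x3


def family(beta):
    """The solution obtained by filling the unfixed class with the word beta."""
    return (beta + "b", beta + "ba" + beta, "a" + beta + "b" + beta + "b", "")


betas = ["".join(p) for n in range(4) for p in product("abc", repeat=n)]
family_ok = all(is_solution(*family(beta)) for beta in betas)
```

Only the members with $|\beta| = 1$ have the length vector $(2, 4, 5, 0)$; the other members are solutions with different lengths.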

The above procedure, illustrated in Example 4, can be seen as a method of
filling the positions of the variables in an equation. This simple method, which
was first used in [15], can be used, for example, to give a very illustrative proof
for the periodicity theorem of Fine and Wilf, cf. e.g. [6].

Now  the  following  lemma  is  obvious.  Denote  by  w[i]  the  i-th  letter  of  the 
word  w. 

Lemma 5. Let $C$ be an equivalence class of the relation $\mathcal{R}_z$ connected to an
equation $e : u = v$. Then the following conditions are satisfied:

1. For any two positions $i, j \in C$ and a $z$-solution $h$ of $e$, $h(u)[i] = h(u)[j]$.

2. If $C$ corresponds to a constant $a$, then for each $z$-solution $h$ of $e$ and each
position $i \in C$, $h(u)[i] = a$.

3. If $C$ corresponds to two different constants $a$ and $b$, then the equation $e$ has
no $z$-solution.

4. If $C$ does not correspond to any constant and $e$ has a $z$-solution, then
replacing the positions in $C$ by any word produces a new solution of $e$.

Note that in case 4 the new solutions obtained need not be $z$-solutions
anymore.

In the formulation of our results we need the notion of a pattern language from [2],
cf. also [11]. A pattern is a word over the alphabet $\Theta \cup \Sigma$. A pattern language
generated by a pattern $w$ is the set of all words which are morphic images of
$w$ under all morphisms $h : (\Theta \cup \Sigma)^* \to \Sigma^*$ satisfying $h(a) = a$, for $a$ in $\Sigma$. In
particular, it is natural to denote by $p((\Sigma^*)^k)$ the pattern language generated
by a pattern $p(X_1, X_2, \ldots, X_k)$ containing $k$ variables $X_1, X_2, \ldots, X_k$. We have
an obvious connection:

Example  5.  Each  pattern  language  is  expressible.  Let  u  be  a  pattern  and  Z  be 
a  variable  which  does  not  occur  in  u.  A  variable  Z  in  equation  Z  =  u  expresses 
the  pattern  language  generated  by  u. 
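
Membership in a pattern language, as defined above, can be decided by a simple backtracking search. The sketch below is an illustration under stated assumptions: patterns are strings in which upper-case characters are variables and all other characters are constants, and, in line with the morphisms used here, a variable may be mapped to the empty word. The function name is hypothetical.

```python
def matches(pattern, word, binding=None):
    """True iff `word` is the image of `pattern` under some morphism that
    fixes the constants (a variable may be mapped to the empty word)."""
    binding = {} if binding is None else binding
    if not pattern:
        return word == ""
    head, rest = pattern[0], pattern[1:]
    if not head.isupper():  # a constant must be matched literally
        return word.startswith(head) and matches(rest, word[1:], binding)
    if head in binding:  # a variable seen before keeps its value
        value = binding[head]
        return word.startswith(value) and matches(rest, word[len(value):], binding)
    # a new variable: try every prefix of the remaining word as its value
    return any(
        matches(rest, word[cut:], {**binding, head: word[:cut]})
        for cut in range(len(word) + 1)
    )
```

For instance, `matches("XaX", "bab")` succeeds with X = b, while `matches("XaX", "ba")` fails because both occurrences of X must receive the same value. The search is exponential in the worst case, which is acceptable for short illustrative words.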

We also need an auxiliary lemma which follows rather straightforwardly from
Lemma 5 and which holds for any number of variables.

Lemma 6. Let $L$ be a language expressible via a variable $X$ in an equation $e$.
Suppose that there is no one-variable pattern $p(Y)$ such that $p(\Sigma^*) \subseteq L$. Then
for each vector $z$ there is a word $w \in L$ such that for each $z$-solution $h$ of $e$,
$h(X) = w$.

Now, denoting by $\#_L(n)$ the number of words of length $n$ in the language $L$,
we are ready to prove the main result of this section.

Theorem 7. Let $L$ be a language expressible by an equation on two variables.
Then either $\#_L(n) = O(n)$ or there is a pattern $p(Y)$ with one variable such
that $p(\Sigma^*) \subseteq L$.

As a straightforward consequence of Theorem 7 we obtain a gap theorem for
possible complexities of the function $\#_L$. Note here that for each language $L$

we  have  #l(^)  = 

Corollary  8.  Let  L  be  expressible  by  an  equation  with  two  variables.  Then  either 
(7?,)  =  0(n)  or  +  6)  =  for  some  constants  a,  b. 


4 Main results

This section is devoted to proving some pumping-like properties of expressible
languages. These are achieved by using the tools of the previous section and, more
importantly, by considering special types of factorizations of words to generalize
a technique in [5], cf. also [14], which was used to prove an upper bound for the
index of periodicity of a minimal solution of a word equation.

We recall that an $F$-factorization of a word $w$ is any sequence $w_1, \ldots, w_k$ of
words from a language $F$ such that $w = w_1 \ldots w_k$. We generalize it as follows.
Let $\mathcal{F}$ be a property of sequences of words. We say that a sequence $w_1, \ldots, w_k$
is an $\mathcal{F}$-factorization of $w$ if $w = w_1 \ldots w_k$ and the sequence $w_1, \ldots, w_k$ satisfies
$\mathcal{F}$. The factors $w_1$ and $w_k$ are called outer factors of $w$ and the other factors are
called inner factors of $w$. Further we say that a property $\mathcal{F}$ defines synchronizing
factorizations, or briefly that $\mathcal{F}$ is synchronizing, if the following holds:

1. Each word admits a unique $\mathcal{F}$-factorization.

2. If a word $w$ admits an $\mathcal{F}$-factorization $v_1, \ldots, v_k$ then, for each symbol $a$
in $\Sigma$, the word $aw$ admits either an $\mathcal{F}$-factorization $u, v, v_2, \ldots, v_k$ where
$uv = av_1$, or an $\mathcal{F}$-factorization $av_1, \ldots, v_k$, and the word $wa$ admits either
an $\mathcal{F}$-factorization $v_1, \ldots, v_{k-1}, u, v$ where $uv = v_k a$, or an $\mathcal{F}$-factorization
$v_1, \ldots, v_k a$.

Note that our notion of an $\mathcal{F}$-factorization is connected to but not the same as
that of a factorization of a free monoid, cf. [3, 16]. These factorizations are used to
decompose free monoids, while in our considerations the focus is on factorizations
of a single word. Note also that the above conditions (1) and (2) could be named
separately; factorizations satisfying (1) could be called uniquely deciphering and
those satisfying (2) synchronizing. We preferred the chosen terminology since all
factorizations considered here satisfy (1). Finally note that conditions (1) and
(2) could be defined with respect to a language $L$: each word of $L$ should satisfy
these conditions.

With  the  above  notions  we  have  the  following  obvious  lemma. 

Lemma 9. Assume that a property $\mathcal{F}$ defines a synchronizing factorization and
that $x_1, x_2, \ldots, x_k$ and $y_1, y_2, \ldots, y_l$ are $\mathcal{F}$-factorizations of words $x$ and $y$,
respectively. Then, if $y$ is a subword of $x$ and the factor $y_1$ of $y$ ends inside a
factor $x_i$ of $x$ and the factor $y_l$ starts inside a factor $x_j$, then $j - i = l - 1$ and
$y_2 = x_{i+1}$, $y_3 = x_{i+2}$, $\ldots$, $y_{l-1} = x_{j-1}$.

We say that an $\mathcal{F}$-factorization is synchronizing with a finite delay if there
are numbers $q, r$ such that for each word $x$ with an $\mathcal{F}$-factorization $x_1, \ldots, x_k$
and each subword $y$ of $x$ with an $\mathcal{F}$-factorization $y_1, \ldots, y_l$, if the factor $y_1$ of
$y$ ends in the factor $x_i$ of $x$ and the factor $y_l$ starts in $x_j$, then $y_1$ is a suffix of
$x_{\max\{i-q,1\}} \cdots x_i$ and $y_l$ is a prefix of $x_j \cdots x_{\min\{j+r,k\}}$.

Let $w_1, \ldots, w_k$ be a factorization of $w$. We say that this factorization of
$w$ synchronizes with a pattern $p$ iff $p = u_1 u_2 \ldots u_k$ where, for all $i$, either $u_i$
is a variable or $u_i = w_i$. Let $n_{\mathcal{F}}(w)$ be the number of different words in the
$\mathcal{F}$-factorization of $w$. For a language $L$, denote $n_{\mathcal{F}}(L) = \max\{n_{\mathcal{F}}(w) : w \in L\}$.
Now we formulate the first tool to show nonexpressibility.

Theorem 10. Let $L$ be an expressible language and $\mathcal{F}$ a property defining
finite delay synchronizing factorizations. Then there exists a number $k$ such that
for each $w \in L$ satisfying $n_{\mathcal{F}}(w) > k$ there is a pattern $p(X_1, \ldots, X_s)$ with
$s = n_{\mathcal{F}}(w) - k$ variables synchronizing with the $\mathcal{F}$-factorization of $w$ and satisfying
$p((\Sigma^*)^s) \subseteq L$.

Next we define, for each primitive word $P$, particular factorizations which
turn out to be synchronizing. Let $P$ be a primitive word. Then, as is well known,
each word $w$ can be uniquely written in the form $w = w_1 P^{x_1} w_2 \ldots w_{k-1} P^{x_{k-1}} w_k$
where

- $w_i$ does not contain $P^2$ as a subword,

- $P$ is a proper prefix of $w_i$, for $1 < i \le k$,

- $P$ is a proper suffix of $w_i$, for $1 \le i < k$,

- $x_j \ge 0$, for $1 \le j \le k - 1$.

These conditions clearly define an instance of an $\mathcal{F}$-factorization; we call it the
$\mathcal{F}_P$-factorization. Moreover, as is straightforward to see, it is synchronizing with
a finite delay. Next we set

$T(w) = \{x_i : P^{x_i} \text{ is a factor in the } \mathcal{F}_P\text{-factorization of } w\}$

and define the index of $w$ with respect to $P$ by the formula

$\exp_P(w) = \max\{x_i : x_i \in T(w)\}$.

Now  we  formulate  our  second  tool  to  show  the  nonexpressibility. 

Theorem 11. Let $L$ be an expressible language and $P$ be a primitive word.
Then there exists a natural number $k$ such that for each word $u$ in $L$ satisfying
$\exp_P(u) > k$ there is a word $w$ in $L$ with $\exp_P(w) < k$ which is obtained
from $u$ by removing some occurrences of $P$.

Theorem 11 can be used to prove the following characterization of expressible
relations concerning lengths of variables.

Let $f$ be a function $f : \mathbb{N}^r \to \mathbb{N}$. We say that a property $f(|X_1|, \ldots, |X_r|) = 0$
is expressible if the relation

$\{(w_1, \ldots, w_r) : f(|w_1|, \ldots, |w_r|) = 0\}$

is expressible.

Corollary 12. Let $X_1, X_2, \ldots, X_r$ be $r$ different variables. If a property

$f(|X_1|, |X_2|, \ldots, |X_r|) = 0$

is expressible, then there is a constant $k$ such that if $f(i_1, i_2, \ldots, i_r) = 0$, for
some $i_1, \ldots, i_r$, and $i_s > k$, then also

$f(i_1, i_2, \ldots, i_{s-1}, p, i_{s+1}, \ldots, i_r) = 0$, for some $p < k$.


5 Applications of main results

In this section we apply the results of the last section to achieve our original
goal: to prove that several very natural properties of words are not expressible.
We recall that, to our knowledge, no such result is known in the literature except for
the property "X being a prefix of Y", which cannot be expressed without using
additional unknowns, cf. [19].

Example 6. The language $L_1 = \{a^n b^n : n \ge 1\}$ is not expressible. We prove it
by contradiction applying Theorem 11, for $P = a$. Let $k$ be the constant from
Theorem 11. Take a word $w = a^n b^n$ with $n$ large enough that $\exp_a(w) > k$. Since
$w \in L_1$ and $\exp_a(w) > k$, there is a word $u$ in $L_1$ which is obtained from $w$ by
removing some occurrences of the word $a$. A contradiction.

Example 7. The property "w is primitive" is not expressible. Now we can apply
Theorem 10. Let $\mathcal{F}$ be the factorization defined by dividing a word into blocks of the
same letter. Clearly, $\mathcal{F}$ has the synchronizing property. Assume that the property
"w is primitive" is expressible and let $k$ be the constant from Theorem 10. Consider
the word $w = a^{k+1} b a^k b \ldots a b$, which admits the factorization $a^{k+1}, b, a^k, b, \ldots, b,
a, b$. Since $n_{\mathcal{F}}(w) = k + 2$, by Theorem 10 there is a pattern with two variables,
and one of them corresponds to a factor of $w$ of the form $a^i$. Since each factor
of this form occurs in $w$ exactly once, the variable occurs exactly once in the
pattern. The result now follows from the fact that the word $w_1 X w_2$ is a square
if $X = w_2 w_1$.
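
The ingredients of Example 7 are easy to reproduce. The following sketch computes the block factorization, counts its distinct factors, and checks that replacing any single block $a^i$ of the word $a^{k+1} b a^k b \ldots a b$ by $w_2 w_1$ (where $w = w_1 a^i w_2$) yields an imprimitive word. The function names and the choice $k = 3$ are assumptions of this illustration.

```python
from itertools import groupby


def block_factorization(w):
    """Split w into maximal blocks of equal letters."""
    return ["".join(group) for _, group in groupby(w)]


def distinct_factors(w):
    """Number of different words in the block factorization of w."""
    return len(set(block_factorization(w)))


def is_primitive(w):
    """A nonempty word is primitive iff it occurs in ww only as prefix and suffix."""
    return bool(w) and (w + w).find(w, 1) == len(w)


def example_word(k):
    """The word a^(k+1) b a^k b ... a b."""
    return "".join("a" * i + "b" for i in range(k + 1, 0, -1))


def substitute_block(w, index):
    """Replace the block at `index` by w2 w1, where w = w1 (block) w2."""
    blocks = block_factorization(w)
    w1, w2 = "".join(blocks[:index]), "".join(blocks[index + 1:])
    return w1 + (w2 + w1) + w2


K = 3
W = example_word(K)
A_BLOCKS = range(0, len(block_factorization(W)), 2)  # blocks of a's sit at even indices
squares = [substitute_block(W, index) for index in A_BLOCKS]
```

Each word in `squares` has the form $(w_1 w_2)^2$, so it lies outside the set of primitive words although it is an instance of the pattern obtained from `W`.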


Example 8. The language $L_2 = (a \cup b)^*$ over the three-letter alphabet $\Sigma = \{a, b, c\}$ is
not expressible. In the same way as in the previous example we prove that if $L_2$
is expressible, then there is a pattern $p(X)$ such that $p(\Sigma^*) \subseteq L_2$. Substituting
$X = c$ we obtain a contradiction.

Example 9. The relation "x and y are of equal length", i.e.

$\{(x, y) \in \Sigma^* \times \Sigma^* : |x| - |y| = 0\}$,

is not expressible. This is due to Corollary 12. Observe here that this relation
is expressible if $|\Sigma| = 1$.

As a consequence of the above examples we easily obtain the following.

Theorem 13. The family of expressible languages is not closed under the
operations of complementation, morphic image, inverse morphic image and shuffle.

We conclude this section by emphasizing that the combination of closure
and nonclosure properties of expressible languages, especially closure under
intersection and union and nonclosure under complementation and morphic image,
makes the family quite different from the usually considered families of formal
languages.


6 Comparisons with other families of languages

We already pointed out that the nonclosure and closure properties of $\mathcal{L}(\Sigma)$
make this family different from most of the usually studied families of languages.
We further emphasize this fact by the following theorem, which is proved by
considering particular languages.

Theorem 14. 1. $\mathcal{L}(\Sigma)$ is a proper subset of the family of recursive languages
over $\Sigma$.

2. $\mathcal{L}(\Sigma)$ is incomparable with the families of D0L, regular and context-free
languages.

7  Concluding  remarks 

As a major contribution of this paper we introduced what are, to our
knowledge, the first tools to show that certain properties of words are not expressible as
solutions of word equations, or more precisely as values of some components of
solutions of word equations. Our tools are based on special factorizations of
words, which we called synchronizing.

As applications of our results, several concrete properties of words were shown
to be nonexpressible by word equations, and several nonclosure properties
of expressible languages were obtained.

On the other hand, we also stated many known closure properties of
expressible languages, and in particular gave a shorter proof of the fact that expressible
properties are closed under disjunction.

Finally, it is worth mentioning that a lot of research remains to be
done in this interesting and fundamental field. We point out here just a few
open problems:

- Problem 1. Is the relation "u is a sparse subword (subsequence) of v"
expressible?

- Problem 2. Are the properties "w is square-free" and "w is a Fibonacci
word" expressible? Recall that Fibonacci words are defined by the recurrence
formulae $w_0 = a$, $w_1 = b$, $w_{n+2} = w_{n+1} w_n$, for $n \ge 0$.

- Problem 3. When is the complement of an expressible language expressible?

- Problem 4. Is our gap theorem true for languages expressible by word
equations with more than two variables?
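
For readers who wish to experiment with Problem 2, the objects involved can be generated as follows. This is a minimal sketch assuming the recurrence $w_0 = a$, $w_1 = b$, $w_{n+2} = w_{n+1} w_n$ recalled above; the function names are hypothetical.

```python
def fibonacci_words(count):
    """Return the first `count` Fibonacci words w_0, w_1, ..."""
    result = ["a", "b"]
    while len(result) < count:
        result.append(result[-1] + result[-2])  # w_{n+2} = w_{n+1} w_n
    return result[:count]


def has_square(w):
    """True iff w contains a factor of the form uu with u nonempty."""
    n = len(w)
    return any(
        w[i:i + length] == w[i + length:i + 2 * length]
        for length in range(1, n // 2 + 1)
        for i in range(n - 2 * length + 1)
    )


FIRST_SIX = fibonacci_words(6)
```

The lengths of these words follow the Fibonacci numbers 1, 1, 2, 3, 5, 8, and a word is square-free exactly when `has_square` returns `False`.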


References 

1. Albert, M.H., and Lawrence, J., A proof of Ehrenfeucht's Conjecture, Theoret.
Comput. Sci. 41, 121-123, 1985.

2. Angluin, D., Finding patterns common to a set of strings, in Proceedings of
STOC'79, 130-141, 1979.

3. Berstel, J., and Perrin, D., Theory of Codes, Academic Press, 1985.

4. Büchi, R., and Senger, S., Coding in the existential theory of concatenation, Arch.
Math. Logik, 26, 101-106, 1986/87.

5. Bulitko, V.K., Equations and inequalities in a free group and a free semigroup,
Tul. Gos. Ped. Inst. Ucen. Zap. Mat. Kafedr. Geometr. i Algebra, 2, 242-252, 1970
(in Russian).

6. Choffrut, C., and Karhumäki, J., Combinatorics of words, in G. Rozenberg and
A. Salomaa (eds), Handbook of Formal Languages, Springer, 1997.

7. Culik II, K., and Karhumäki, J., Systems of equations and Ehrenfeucht's
conjecture, Discr. Math., 43, 139-153, 1983.

8. Grigorieff, S., Personal communication.

9. Guba, V., The equivalence of infinite systems of equations in free groups and free
semigroups to their finite subsystems, Matem. Zametki, 40 (3), September 1986 (in
Russian).

10. Harrison, M.A., Introduction to Formal Language Theory, Addison-Wesley
Publishing Company, 1978.

11. Jiang, T., Salomaa, A., Salomaa, K., Yu, S., Decision problems for patterns,
J. Comput. Sys. Sciences 50, 53-63, 1995.

12. Khmelevski, Yu. I., Solution of word equations in three variables, Dokl. Akad. Nauk
SSSR, 177, 1023-1025, 1967 (in Russian).

13. Khmelevski, Yu. I., Equations in free semigroups, Trudy Mat. Inst. Steklov, 107,
1971 (English translation: Proc. Steklov Inst. of Mathematics 107 (1971), American
Mathematical Society, 1976.)

14. Koscielski, A., and Pacholski, L., Complexity of Makanin's algorithm, J. ACM
43(4), 670-684, 1996.

15. Lentin, A., Equations dans des Monoides Libres, Gauthier-Villars, 1972.

16. Lothaire, M., Combinatorics on Words, Addison-Wesley, 1993.

17. Makanin, G.S., The problem of solvability of equations in a free semigroup, Mat.
Sb., Vol. 103 (145), 147-233, 1977. English transl. in Math. U.S.S.R. Sb. Vol 32,
1977.

18. Matijasevich, Y., Enumerable sets are diophantine, Soviet Math. Doklady 11, 354-
357, 1970. English transl. in Dokl. Akad. Nauk SSSR 191, 279-282, 1971.

19. Seibert, S., Quantifier hierarchies and word relations, Springer LNCS 626, 329-338
(1992).


Finite Loops Recognize Exactly the Regular Open Languages


Martin Beaudry, François Lemieux, Denis Thérien


Abstract 

In  this  paper,  we  characterize  exactly  the  class  of  languages  that  are 
recognizable  by  finite  loops,  i.e.  by  cancellative  binary  algebras  with  an 
identity. This turns out to be the well-studied class of regular open
languages. Our proof technique is interesting in itself: we generalize the
operation  of  block  product  of  monoids,  which  is  so  useful  in  the  associative 
case,  to  the  situation  where  the  left  factor  in  the  product  is  non-associative. 


1  Introduction 

The algebraic approach to the study of regular languages, based on considering
finite monoids as language recognizers, is certainly the most powerful tool
available for understanding computations realized by finite-state automata. It has
developed into a rich and coherent framework relating combinatorial descriptions
of regular languages to algebraic properties of their recognizers [10, 16]. An
early example of such a relationship is the famous theorem of Schützenberger [24]:
a subset of $A^*$ is star-free (i.e. can be obtained from finite sets using Boolean
operations and concatenation) iff it can be recognized by a group-free monoid
(i.e. one in which no subset forms a non-trivial group).

In much the same way that monoids can be used to recognize languages,
one may also consider other types of algebras and study their computational
power. For example, non-associative binary algebras, usually called groupoids,
are exactly the recognizers needed for context-free languages: this relationship
has long been well known in the theory of tree automata (see [12]) and can be
traced back to the paper of Mezei and Wright [15]. It has also been used in
complexity-theoretic work (e.g. see [26, 5]). In view of this connection, it is
natural to try to characterize the languages recognized by various specific
subclasses of finite groupoids. One such class, which has been extensively studied
in the past [7, 8, 1, 6], consists of loops, i.e. groupoids with an identity for
which every row and every column of the multiplication table contains every
element. In [9] it was proved that any language recognized by a finite loop must
be regular. The main result of our paper gives an exact characterization of the
languages that can be recognized by loops.

The answer is surprising and elegant: a language $L \subseteq A^*$ can be recognized
by a finite loop iff $L$ is regular and open in the group topology on $A^*$. This
topology, introduced in [21, 22], is the smallest one such that every morphism
from $A^*$ onto a finite group is continuous; investigations of its properties were
motivated early on by several deep connections with important questions about
finite monoids [17, 18]. Our result thus adds a new perspective on a class of
languages which already has a significant history.

The  paper  is  organized  as  follows:  in  section  2,  we  introduce  most  of  the 
relevant  definitions  that  will  be  needed.  In  section  3,  we  present  some  tools 
that  are  useful  in  constructing  loops  to  recognize  languages.  In  section  4,  we 
show  that  loops  recognize  only  regular  open  languages.  In  section  5,  we  prove 
that  every  regular  open  language  is  recognizable  by  a  finite  loop.  We  derive  some 
consequences  of  this  theorem  in  the  last  section  where  we  also  present  some  ideas 
for  further  applications  of  our  techniques. 


2  Preliminaries 

In  this  section,  we  introduce  our  notation  and  review  some  elementary  facts  about 
monoids  and  groupoids. 

Let $A$ be a finite set: we write $A^*$ for the free monoid generated by $A$,
i.e. for the set of all words of finite length over the alphabet $A$, concatenation
being the associative operation. The length of a word $x$ is denoted by $|x|$ and
$\varepsilon$ stands for the unique word of length 0, which is the identity element
of the free monoid. A congruence on $A^*$ is an equivalence relation $\alpha$ that is
compatible with concatenation, i.e. $x_1 \,\alpha\, y_1$ and $x_2 \,\alpha\, y_2$ imply
$x_1 x_2 \,\alpha\, y_1 y_2$. The quotient $A^*/\alpha$ is then an $A$-generated monoid,
and every $A$-generated monoid is of this form.
A language $L \subseteq A^*$ is recognized by the monoid $M$ iff there exist a morphism
$\phi : A^* \to M$ and a subset $F$ of $M$ such that $L = \{x \in A^* : \phi(x) \in F\}$;
equivalently we can view the morphism as going from $A^*$ to $M^*$, transforming
a word $x$ into a string of monoid elements which is then evaluated in $M$; in
this point of view, only alphabetical morphisms need to be considered, that is
morphisms mapping letters to letters. We observe that when $L$ is recognized
by $M$, $\phi(A^*)$ is a submonoid of $M$ isomorphic to $A^*/\alpha$ for some congruence
$\alpha$ and $L$ is a union of $\alpha$-classes. It is well known that a language is
regular iff it can be recognized by a finite monoid, i.e. iff it is a union of
$\alpha$-classes for some congruence $\alpha$ of finite index. We will say that
$L \subseteq A^*$ is a group language iff $L$ can be recognized by a finite group.
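
As a small illustration of recognition by a finite monoid, the following Python sketch evaluates the image of a word under an alphabetic morphism and tests membership in the accepting set. The example monoid (the group of integers modulo 2), the morphism and all names are assumptions chosen for illustration only.

```python
from functools import reduce


def recognizes(word, letter_image, multiply, identity, accepting):
    """Return True iff the image of `word` in the monoid lies in `accepting`.

    `letter_image` maps each letter to a monoid element (the morphism on
    letters) and `multiply` is the associative monoid operation.
    """
    value = reduce(multiply, (letter_image[c] for c in word), identity)
    return value in accepting


# Illustrative group language: words over {a, b} with an even number of a's,
# recognized by the two-element group Z_2 with accepting set {0}.
LETTER_IMAGE = {"a": 1, "b": 0}


def add_mod_2(x, y):
    return (x + y) % 2


def has_even_number_of_as(word):
    return recognizes(word, LETTER_IMAGE, add_mod_2, 0, {0})
```

Since the recognizing monoid here is a finite group, the language of this example is a group language in the sense just defined.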

The notions above have natural counterparts in the non-associative world.
A groupoid is given by a set and a binary operation: we will assume here that
every groupoid contains a 2-sided identity element. The free groupoid generated
by $A$ will be denoted by $A^{(*)}$. It will be convenient to think of an element in
$A^{(*)}$ as a pair $(t, x)$ where $x$ is a word in the free monoid and $t$ is a rooted
binary tree with $|x|$ leaves; in particular, the identity of the free groupoid is then
$(\emptyset, \varepsilon)$. The product of $(t_1, x_1)$ with $(t_2, x_2)$ is then the pair
$(t_1 t_2, x_1 x_2)$ where $t_1 t_2 = t_1$ if $t_2$ is empty, $t_1 t_2 = t_2$ if $t_1$ is
empty, and otherwise $t_1 t_2 = t$ is the tree obtained by joining the root of $t_1$ and
the root of $t_2$ to a new node, which becomes the root of $t$. When $g \in A^{(*)}$ is
identified in this way with a pair $(t, x)$, we define $\mathrm{Tree}(g) = t$ and
$\mathrm{Yield}(g) = x$. We say that $g_1, g_2 \in A^{(*)}$ are yield-equivalent if
$\mathrm{Yield}(g_1) = \mathrm{Yield}(g_2)$. We can view each row and each column in
the multiplication table of the groupoid $G$ as defining a mapping from $G$ to $G$.
The closure of these mappings under the operation of composition is called the
multiplicative monoid of $G$, denoted by $\mathcal{M}(G)$.

Congruences on $A^{(*)}$ are defined in the same way as in the associative case. If
$\alpha$ is a congruence on $A^{(*)}$, the quotient $A^{(*)}/\alpha$ is an $A$-generated
groupoid, and every $A$-generated groupoid arises in this way. A loop $G$ is a groupoid
whose multiplication table contains every element in each row and in each column;
clearly, this will be the case iff $\mathcal{M}(G)$ is a group. Equivalently, a loop is a
groupoid that is left and right cancellative, i.e. $ab = ac$ implies $b = c$ and
$ba = ca$ implies $b = c$, for any $a, b, c \in G$.
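
The row and column condition is easy to test on a Cayley table. The Python sketch below is illustrative only: elements are assumed to be numbered 0 to n-1, and the 5-element table `LOOP5` is an example chosen here, not one taken from the paper.

```python
def is_loop(table, identity=0):
    """Return True iff `table` is the Cayley table of a loop.

    `table[a][b]` is the product ab; every row and every column must contain
    every element, and `identity` must be a two-sided identity.
    """
    n = len(table)
    elements = set(range(n))
    has_identity = all(
        table[identity][x] == x and table[x][identity] == x for x in range(n)
    )
    rows_ok = all(set(row) == elements for row in table)
    cols_ok = all({table[r][c] for r in range(n)} == elements for c in range(n))
    return has_identity and rows_ok and cols_ok


def is_associative(table):
    """Return True iff (ab)c = a(bc) for all elements a, b, c."""
    n = len(table)
    return all(
        table[table[a][b]][c] == table[a][table[b][c]]
        for a in range(n) for b in range(n) for c in range(n)
    )


# An illustrative loop of order 5 with identity 0. It is not associative:
# (1*1)*2 = 0*2 = 2, whereas 1*(1*2) = 1*3 = 4.
LOOP5 = [
    [0, 1, 2, 3, 4],
    [1, 0, 3, 4, 2],
    [2, 4, 0, 1, 3],
    [3, 2, 4, 0, 1],
    [4, 3, 1, 2, 0],
]

# The cyclic group Z_3 is an associative loop.
Z3 = [[(a + b) % 3 for b in range(3)] for a in range(3)]
```

For a finite table, the check in `is_loop` is the same as checking left and right cancellativity, since a row or column of length n without repetition must contain all n elements.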

We now wish to use groupoids to recognize subsets of $A^*$; note that if $G$
is not associative, the notion of a morphism from $A^*$ to $G$ is not well-defined.
We say that the language $L \subseteq A^*$ is recognized by the groupoid $G$ if there
exist an alphabetic morphism $\phi : A^* \to G^*$ and a subset $F$ of $G$ such that
$L = \{x \in A^* : G(\phi(x)) \cap F \neq \emptyset\}$, where $G(\phi(x))$ is the set of
elements of $G$ obtained by evaluating the string $\phi(x)$ of $G^*$ in all possible ways.
Note that if $G$ is associative, there is only one way of evaluating $\phi(x)$ and we are
back to the definition given for monoids. We will say that $L \subseteq A^*$ is a loop
language iff $L$ can be recognized by a loop. In terms of congruences, the groupoid $G$
recognizes the language $L$ iff there exist an $A$-generated subgroupoid of $G$
isomorphic to $A^{(*)}/\alpha$ and a subset $F$ of this subgroupoid such that $x \in L$
iff there is some tree $t$ such that $[(t, x)]_\alpha$ is in $F$. One pleasant feature of
this notion of language recognition is the following.

Lemma  2.1  [5]  L  is  recognizable  by  a  finite  groupoid  iff  L  is  context-free. 
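
The set $G(\phi(x))$ of all values obtained over all bracketings can be computed by dynamic programming over segments of the string, in the spirit of context-free parsing. The Python sketch below is a minimal illustration under stated assumptions: elements are numbered 0 to n-1, the empty string is taken to evaluate to the identity, and the example table `LOOP5` and the function names are chosen here rather than taken from the paper.

```python
from functools import lru_cache

# Illustrative non-associative loop of order 5 (identity 0).
LOOP5 = [
    [0, 1, 2, 3, 4],
    [1, 0, 3, 4, 2],
    [2, 4, 0, 1, 3],
    [3, 2, 4, 0, 1],
    [4, 3, 1, 2, 0],
]


def all_values(table, word, identity=0):
    """Return the set of elements obtained by bracketing `word` in all ways."""
    word = tuple(word)
    if not word:
        return frozenset({identity})

    @lru_cache(maxsize=None)
    def values(i, j):
        # Values of the segment word[i:j]; a single element evaluates to itself.
        if j - i == 1:
            return frozenset({word[i]})
        result = set()
        for m in range(i + 1, j):  # position of the outermost product
            for a in values(i, m):
                for b in values(m, j):
                    result.add(table[a][b])
        return frozenset(result)

    return values(0, len(word))


def recognizes(table, letter_image, accepting, x):
    """Return True iff some bracketing of the image of x lands in `accepting`."""
    string = [letter_image[c] for c in x]
    return bool(all_values(table, string) & set(accepting))
```

For example, `all_values(LOOP5, [1, 1, 2])` is `{2, 4}`, the two bracketings giving different results; over an associative table the returned set is always a singleton.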

The  finite  group  topology  on  A*  is  the  smallest  topology  such  that  every 
morphism  from  A*  onto  a  finite  group  is  continuous.  It  is  equivalent  to  say 
that  the  group  languages  form  a  basis  for  this  topology.  It  was  first  introduced 
by  Hall  [13]  for  the  free  group,  and  by  Reutenauer  for  the  free  monoid  [21,  22]. 
Connections  were  soon  discovered  between  some  classical  problems  about  finite 
monoids  and  computing  the  closure  of  a  given  regular  language  for  this  topology 
[17,  18];  it  thus  became  an  important  question  to  characterize  which  regular 
languages  are  open  or  closed.  A  sequence  of  deep  results  [2,  3]  finally  led  to  the 
following  combinatorial  characterization  for  the  regular  open  sets  [19]. 

Lemma 2.2 A regular language is open iff it is a finite union of languages of the
form $L_0 a_1 L_1 \cdots a_k L_k$ where the $a_i$'s are letters and the $L_i$'s are
group languages.
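
To illustrate the shape of the languages in Lemma 2.2, the following Python sketch tests membership of a word in a product $L_0 a_1 L_1 \cdots a_k L_k$ by trying every position for each marked letter. The group languages are passed as membership predicates; the function name and the example (words of even length, a language recognized by the two-element group) are assumptions made for illustration.

```python
def in_product(word, group_langs, letters):
    """Return True iff word is in L_0 a_1 L_1 ... a_k L_k.

    `group_langs` is a list of k+1 membership predicates (for L_0, ..., L_k)
    and `letters` is the list of the k marked letters a_1, ..., a_k.
    """
    k = len(letters)

    def match(start, i):
        # Can word[start:] be split as x_i a_{i+1} x_{i+1} ... a_k x_k
        # with each x_j in L_j?
        if i == k:
            return group_langs[k](word[start:])
        for pos in range(start, len(word)):
            if (word[pos] == letters[i]
                    and group_langs[i](word[start:pos])
                    and match(pos + 1, i + 1)):
                return True
        return False

    return match(0, 0)


def even_length(w):
    """Membership in the group language of even-length words."""
    return len(w) % 2 == 0
```

With `group_langs = [even_length, even_length]` and `letters = ["a"]`, the word `"bba"` is accepted (split as `bb`, `a`, empty word) while `"ba"` is rejected.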


3  Recognizing languages with loops

The aim of this section is to prove that any language of the form
$B^* a_1 B^* \cdots B^* a_k B^*$, where $B$ is a finite alphabet and $a_i \in B$, can be
recognized by a finite loop. This result will be of great help in proving Theorem 5.3.

In general, it is not an easy task to construct directly a loop $B$ that recognizes
a given language $L$. What can be done instead is to construct a partially defined
loop $G$ that recognizes $L$, and then embed $G$ into a loop $B$. This motivates the
following definition.

A groupoid $G$ with an absorbing element, denoted 0, is weakly cancellative if
for any $a, x, y \in G$, the two properties $(ax = ay \neq 0) \Rightarrow (x = y)$ and
$(xa = ya \neq 0) \Rightarrow (x = y)$ are satisfied.

The  Cayley  table  of  a  weakly  cancellative  groupoid  is  such  that  in  each  row 
and  each  column  no  nonzero  element  appears  twice.  Hence,  the  nonzero  elements 
of  such  a  groupoid  form  a  partially  defined  groupoid  which  we  call  an  incomplete 
loop.  This  terminology  is  justified  by  the  following  lemma. 

Lemma 3.1 ([11]) An incomplete loop containing $n$ elements can be embedded
in a loop containing $t$ elements, for any $t \geq 2n$.

Recall that if $Q$ is a loop and $w \in Q^+$ then $Q(w)$ is the set of elements that
result from evaluating $w$ using all possible bracketings.

Lemma 3.2 Let $Q$ be a loop and let $u, v, w$ be words over $Q$. Then, the cardinality
of $Q(uwv)$ is at least as large as that of $Q(w)$.

Weakly  cancellative  groupoids  will  be  useful  to  prove  that  a  language  can  be 
recognized  by  a  loop.  This  is  a  consequence  of  the  following  lemma. 

Lemma  3.3  Any  language  recognized  by  a  weakly  cancellative  groupoid,  with  0 
in  the  accepting  set,  is  also  recognized  by  a  loop. 

Proof. Let $G$ be a weakly cancellative groupoid, and let $L \subseteq G^*$ be a
language recognized by $G$. Assume that 0 belongs to the accepting set. Let
$B = G \setminus \{0\}$, let $B^{(*)}$ be the free groupoid over the set $B$, and let
$\beta$ be the cardinality of $B$. We also denote by $B$ the incomplete loop induced by
the elements of $B$ in $G$.

We will define a sequence of incomplete loops $B_i$, for $i \geq 0$. Let $B_0 = B$ and
define $B_{i+1}$ from $B_i$ as follows. All products defined in $B_i$ are defined
identically in $B_{i+1}$. Moreover, for any undefined product $a \cdot b$ in $B_i$, we
define $a \cdot b = (ab)$ in $B_{i+1}$.

Remark. Observe that for any $a, b \in B$, if the product $ab$ is not defined in
$B_k$, then $c = ab$ is a new element in $B_{k+1}$. Moreover, for any $d \in B_{k+1}$,
the products $cd$ and $dc$ are not defined in $B_{k+1}$. Those products generate two new
elements in $B_{k+2}$, and so on. This and Lemma 3.2 imply that for any $u, v \in B^*$
such that $k = |u| + |v|$, $B_{k+1}(uabv)$ contains at least $k$ elements.

Let $k = \beta + 2$ and let $B_k$ be embedded in a finite loop $H$. We will argue that
$L$ is recognized by $H$ with the accepting set containing all nonzero elements of
the accepting set of $G$ plus all elements not in $B$.

If $w \in B^*$ can be evaluated to a nonzero element in $G$, then $w$ can be evaluated
to the same element in $H$ using the same parenthesization. This shows that if
$w \in B^*$ cannot be evaluated to 0 in $G$, then $w$ is accepted by $G$ if and only if it
is accepted by $H$.

Suppose that $w$ can be evaluated to 0 in $G$. Then, there exists a segment $u$
of $w$ of minimal length that can be evaluated to 0, i.e. $w = sut$, $0 \in G(u)$ and
for any strict segment $v$ of $u$, $0 \notin G(v)$. So, there exist $u_1, u_2 \in B^+$ and
$a, b \in B$ such that $u = u_1 u_2$, $a \in G(u_1)$, $b \in G(u_2)$ and $ab = 0$ in $G$,
but $a \neq 0$ and $b \neq 0$. This implies that $w$ can be partially evaluated to $sabt$
both in $G$ and in $H$. Now, there are two possibilities. First, if $|s| + |t| < k$, then
$s(ab)t$ can only be evaluated, in $H$, to an element in $B^{(*)} \setminus B$; in this case
$H$ accepts $w$. Otherwise, by the above remark, $H(w)$ contains at least $\beta + 1$
different elements, and so, at least one of them is not in $B$. Thus, $H$ accepts $w$ if
and only if $G$ accepts $w$. □

As  an  example  of  application  of  Lemma  3.3,  we  can  show  that  any  cofinite 
language  is  recognized  by  a  finite  loop.  Since  it  is  easy  to  see  that  no  finite 
language  can  be  recognized  by  a  finite  loop,  the  class  of  loop  languages  is  not 
closed  under  complement. 

Loops can also recognize languages that are not cofinite and are not recognized
by a group. A simple example is the set $OR \subseteq \{0, 1\}^*$, composed of all
words that contain at least one 1. This language is recognized by $U_1$, the monoid
defined by $00 = 0$ and $01 = 10 = 11 = 1$. Here, 0 is an identity and 1
is absorbing. Since $U_1$ is a weakly cancellative groupoid, the language OR can
be recognized by a finite loop. It is easy to verify that the complement of OR
cannot be recognized by any finite loop.
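
The two facts used about $U_1$ can be checked mechanically. The Python sketch below is illustrative only: it tests weak cancellativity of a Cayley table (no element other than the absorbing one repeated in a row or column) and verifies by brute force, on all short words, that $U_1$ with accepting set $\{1\}$ accepts exactly the words containing a 1. The function names and the length bound are assumptions; the sketch does not carry out the embedding into a loop given by Lemma 3.3.

```python
from itertools import product

# Cayley table of U_1: element 0 is the identity, element 1 is absorbing.
U1 = [[0, 1],
      [1, 1]]


def is_weakly_cancellative(table, zero):
    """Check that `zero` is absorbing and that no other element appears
    twice in any row or column of `table`."""
    n = len(table)
    if any(table[zero][x] != zero or table[x][zero] != zero for x in range(n)):
        return False
    for a in range(n):
        row = [v for v in table[a] if v != zero]
        col = [table[x][a] for x in range(n) if table[x][a] != zero]
        if len(row) != len(set(row)) or len(col) != len(set(col)):
            return False
    return True


def evaluate(table, word, identity=0):
    """Evaluate `word` left to right; for a monoid every bracketing agrees."""
    value = identity
    for g in word:
        value = table[value][g]
    return value


def u1_recognizes_or(max_len=8):
    """Compare U_1 acceptance with the definition of OR on all short words."""
    for n in range(max_len + 1):
        for w in product((0, 1), repeat=n):
            if (evaluate(U1, w) == 1) != (1 in w):
                return False
    return True
```

Note that the absorbing element of $U_1$ is the element written 1, so it plays the role of the element denoted 0 in the definition of weak cancellativity.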

We  close  this  section  with  a  lemma  that  will  play  an  important  role  in  the 
proof  of  Theorem  5.3. 

Lemma 3.4 Let $A$ be a finite alphabet and let $a_1, \ldots, a_k$ be elements of $A$
($k > 0$). Then $L_k = A^* a_1 A^* \cdots A^* a_k A^*$ is recognized by a finite loop.

Proof. The proof is by induction on $k$. Observe first that $L_1 = A^* a_1 A^*$
can be recognized by the weakly cancellative groupoid $U_1$ discussed above. By
Lemma 3.3, $L_1$ can also be recognized by a finite loop.

Let $k = i + j$ where $i, j > 0$. Then, there exists a finite loop $Q_i$ that recognizes
$L_i = A^* a_1 A^* \cdots A^* a_i A^*$ with the accepting set $F_i \subseteq Q_i$ and there
exists a finite loop $Q_j$ that recognizes $L_j = A^* a_{i+1} A^* \cdots A^* a_k A^*$ with
the accepting set $F_j \subseteq Q_j$.

Consider the weakly cancellative groupoid $Q$ defined as the loop $Q_i \times Q_j$
except that $(a, b)(c, d) = 0$ whenever $a \in F_i$ and $d \in F_j$. Then, $Q$ recognizes
the language $L_k = A^* a_1 A^* \cdots A^* a_k A^*$ with 0 as the accepting element. By
Lemma 3.3, $L_k$ is also recognized by a finite loop. □


4  Finite loops recognize only open regular languages

In  [9],  it  is  shown  that  finite  loops  only  recognize  regular  languages.  In  this 
section  we  refine  this  result  by  showing  that  only  open  regular  languages  can  be 
recognized  by  such  algebras.  The  following  can  be  observed. 

Lemma 4.1 Any language $L \subseteq A^*$ of the form $L_0 \cdots L_k$, where each $L_i$
is recognized by a finite group, is open.

To prove the next theorem, we will use the following definition. Let $A$ be an
alphabet and $S$ a set of variables. A special tree $t$ over $A$ with variables in $S$ is
a binary tree where each element of $S$ appears exactly once as a label of a leaf.
We will use special trees in two particular situations: when $S$ contains a single
variable $X$; and when each leaf of $t$ is labeled with a variable in $S$. In this last
case we say that $t = t(x_1, \ldots, x_n)$ is a special tree with $n$ leaves.

Let $t$ be a special tree over $A$ with the variable $X$ and let $t'$ be any tree. We
denote by $t \cdot t'$ the tree obtained when the leaf in $t$ labeled with $X$ is replaced by
$t'$. Observe that when $t'$ is also a special tree with variable $X$, the result is still
a special tree with variable $X$.

Observe also that $\cdot$ is an associative operation. Hence, for any special trees
$t_1, \ldots, t_k$ over $Q$ with variable $X$, the expression $t_1 \cdot t_2 \cdots t_k$
defines the same special tree no matter which parenthesization is used. This will be
denoted by $\prod_{i=1}^{k} t_i$.

Similarly, if $s(x_1, \ldots, x_n)$ is a special tree with $n$ leaves and $t_1, \ldots, t_n$
are arbitrary trees, then $s(t_1, \ldots, t_n)$ is the tree obtained by substituting the
tree $t_i$ for the leaf labeled with $x_i$, for all $i$.
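
The two substitution operations on special trees can be sketched in a few lines of Python. In this illustrative encoding, which is an assumption of the sketch and not the paper's notation, a leaf is a string label, an internal node is a pair `(left, right)`, and the single variable is the label `"X"`.

```python
def substitute(t, replacement, var="X"):
    """Return t . replacement: the leaf of t labeled `var` is replaced."""
    if t == var:
        return replacement
    if isinstance(t, tuple):
        left, right = t
        return (substitute(left, replacement, var),
                substitute(right, replacement, var))
    return t  # any other leaf is left unchanged


def substitute_all(s, trees):
    """Return s(t_1, ..., t_n): each leaf labeled by a key of `trees`
    is replaced by the corresponding tree."""
    if isinstance(s, tuple):
        left, right = s
        return (substitute_all(left, trees), substitute_all(right, trees))
    return trees.get(s, s)


def tree_yield(t):
    """Return the list of leaf labels of t, read from left to right."""
    if isinstance(t, tuple):
        return tree_yield(t[0]) + tree_yield(t[1])
    return [t]
```

With `t1 = ("a", "X")`, `t2 = (("X", "b"), "c")` and `t3 = ("d", "X")`, both `substitute(substitute(t1, t2), t3)` and `substitute(t1, substitute(t2, t3))` give `("a", ((("d", "X"), "b"), "c"))`, which illustrates the associativity of the operation on one instance.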

Theorem  4.2  Finite  loops  recognize  only  open  regular  languages. 

Proof.  We  will  use  the  technique  of  [9]. 

Let $Q$ be a finite loop. We define a comb over $Q$ recursively as follows. Any
$a \in Q \cup \{\varepsilon\}$ is a comb. If $a \in Q$ and $u \in Q^{(*)}$ is a comb then
$w = (au)$ is also a comb. No other element of $Q^{(*)}$ is a comb. Hence, a comb
$c \in Q^{(*)}$ corresponds to the left-to-right bracketing of $\mathrm{Yield}(c)$.

Any $t \in Q^{(*)}$ can be decomposed into $t = s(t_1, \ldots, t_n)$, where $n \geq 1$,
$s(x_1, \ldots, x_n)$ is a special tree with $n$ leaves, and each $t_i$ is a comb over $Q$.
Let $\mathrm{comb}(t)$ be the smallest $n$ for which such a decomposition exists.

We will show that, for any tree $t \in Q^{(*)}$, there exists a yield-equivalent tree
$s \in Q^{(*)}$ evaluating to the same element and such that $\mathrm{comb}(s)$ is bounded
by a constant. By Lemma 4.1, this will prove the theorem because the set of words in
$Q^*$ that evaluate left-to-right to a given element forms a language recognized by
the multiplication group of $Q$.

More precisely, we will show that for any tree $t \in Q^{(*)}$ such that
$\mathrm{comb}(t) > 8^q$, where $q$ is the order of $Q$, we can find a yield-equivalent
tree $t' \in Q^{(*)}$ evaluating to the same element as $t$ and such that
$\mathrm{comb}(t') < \mathrm{comb}(t)$.

Suppose that $t \in Q^{(*)}$ is such that $\mathrm{comb}(t) = n > 8^q$, and let
$t = s(t_1, \ldots, t_n)$ be decomposed as explained above. Since $s$ has more than $8^q$
leaves, it must possess a path of length $k \geq 3q$. Let the nodes on this path be
$d_0, d_1, \ldots, d_k$, where $d_0$ is the root of $s$ and $d_{i+1}$ is a child of $d_i$.

For $0 \leq i \leq q$, let $s_i$ be the tree rooted at $d_{3i}$. Moreover, for
$0 \leq i < q$, let $v_i$ be the special tree constructed from $s_i$ by substituting the
variable $X$ for $s_{i+1}$. Hence, for each $v_i$ there exist four indices
$1 \leq \alpha_i \leq \beta_i < \gamma_i \leq \delta_i \leq n$ such that the leaves at the
left of $X$ in $v_i$ are labeled with $x_{\alpha_i}, \ldots, x_{\beta_i}$ and those on its
right are labeled with $x_{\gamma_i}, \ldots, x_{\delta_i}$. Moreover, the leaves of $s_q$
are labeled with $x_{\alpha_q}, \ldots, x_{\beta_q}$ for some
$1 \leq \alpha_q \leq \beta_q \leq n$. We have $s = (\prod_{i=0}^{q-1} v_i) \cdot s_q$, where
$v_i = v_i(x_{\alpha_i}, \ldots, x_{\beta_i}, X, x_{\gamma_i}, \ldots, x_{\delta_i})$ and
$s_q = s_q(x_{\alpha_q}, \ldots, x_{\beta_q})$.

We can thus write $t = (\prod_{i=0}^{q-1} z_i) \cdot z_q$, where
$z_i = v_i(t_{\alpha_i}, \ldots, t_{\beta_i}, X, t_{\gamma_i}, \ldots, t_{\delta_i})$ and
$z_q = s_q(t_{\alpha_q}, \ldots, t_{\beta_q})$.

Let $w_i = \mathrm{Yield}(t_i)$ and define $l_i$ to be the comb whose yield is
$w_{\alpha_i} \cdots w_{\beta_i}$, and $r_i$ the comb whose yield is
$w_{\gamma_i} \cdots w_{\delta_i}$. Then $\hat{z}_i = ((l_i X) r_i)$ is yield-equivalent to
$z_i$. Using the fact that our loop is both cancellative and finite, it is easily verified
(Lemma 7 of [9]) that there exist two integers $a$ and $b$ such that $t$ and $t'$ evaluate
to the same element, where $t'$ is obtained from $t$ by replacing some of the factors
$z_i$ (those selected by $a$ and $b$) with the corresponding $\hat{z}_i$.

We observe that $\mathrm{comb}(z_i) \geq 3$ while $\mathrm{comb}(\hat{z}_i) \leq 2$. This
implies that $\mathrm{comb}(t') < \mathrm{comb}(t)$, proving the theorem. □


5  Every  regular  open  language  is  recognized  by 
a  finite  loop 

In  this  section,  we  will  conclude  the  proof  of  our  main  result  by  establishing  the 
converse  of  Theorem  4.2.  In  order  to  do  so,  we  will  introduce  the  block  product  of  a 
monoid  with  a  groupoid.  If  the  second  operand  is  also  a  monoid,  our  construction 
reduces  to  the  known  notion  of  block  product  applied  to  associative  structures 
[23,  25],  which  has  proved  itself  to  be  extremely  useful  as  a  decomposition
tool  for  finite  monoids.  Note  that  the  block  product  is  a  two-sided  version  of 
the  classical  notion  of  wreath  product:  our  construction  below  can  be  trivially 
modified  to  define  the  wreath  product  of  a  monoid  with  a  groupoid.  Actually, 
the  wreath  product  is  sufficient  to  prove  our  main  result.  We  choose  to  give  the 
more  general  construction  as  the  potential  for  future  applications  seems  more 
important. 

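Let $\alpha$ be a finite-index congruence on $A^*$, let $B$ be the finite alphabet
$A^*/\alpha \times A \times A^*/\alpha$ and let $\beta$ be a finite-index congruence on the
free groupoid $B^{(*)}$. For any $u, v$ in $A^*$, we define the mapping
$\theta_{u,v} : A^* \to B^*$ by $\theta_{u,v}(\varepsilon) = \varepsilon$ and
$\theta_{u,v}(a_1 \ldots a_n) = b_1 \ldots b_n$, where
$b_i = ([u a_1 \ldots a_{i-1}]_\alpha, a_i, [a_{i+1} \ldots a_n v]_\alpha)$.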
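
The mapping $\theta_{u,v}$ can be sketched in Python for a concrete congruence. As an assumption made only for this illustration, $\alpha$ is taken to be the congruence on $\{a, b\}^*$ that identifies two words when they contain the same number of a's modulo 2, so that a class is represented by 0 or 1. The sketch also checks, on one instance, the factorization of $\theta_{u,v}(x_1 x_2)$ that is used in the proof of Lemma 5.1.

```python
def cls(word):
    """Class of `word` for the illustrative congruence: number of a's mod 2."""
    return word.count("a") % 2


def theta(u, v, x):
    """Return theta_{u,v}(x) as a list of letters of B.

    The i-th letter records the class of the prefix read so far (with u in
    front), the current letter of x, and the class of the remaining suffix
    (followed by v).
    """
    return [(cls(u + x[:i]), x[i], cls(x[i + 1:] + v)) for i in range(len(x))]


# Factorization used in the proof of Lemma 5.1, checked on one instance:
# theta_{u,v}(x1 x2) = theta_{u, x2 v}(x1) theta_{u x1, v}(x2).
u, v, x1, x2 = "ab", "b", "aba", "ba"
assert theta(u, v, x1 + x2) == theta(u, x2 + v, x1) + theta(u + x1, v, x2)
```

For example, `theta("", "", "ab")` is `[(0, "a", 0), (1, "b", 0)]`: each letter of the input is decorated with the class of its left context and the class of its right context.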

We now define a binary relation $\beta \,\square\, \alpha$ on $A^{(*)}$:

$(t, x) \ \beta \,\square\, \alpha \ (s, y)$ iff 1) $x \,\alpha\, y$ and
2) $(t, \theta_{u,v}(x)) \ \beta \ (s, \theta_{u,v}(y))$ for all $u, v \in A^*$.

Lemma 5.1 $\beta \,\square\, \alpha$ is a congruence of finite index on $A^{(*)}$.

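Proof. That it is an equivalence of finite index is easily checked. Suppose
now that $(t_1, x_1) \ \beta \,\square\, \alpha \ (s_1, y_1)$ and
$(t_2, x_2) \ \beta \,\square\, \alpha \ (s_2, y_2)$; we want to show that
$(t_1 t_2, x_1 x_2) \ \beta \,\square\, \alpha \ (s_1 s_2, y_1 y_2)$.

Since $\alpha$ is a congruence, we have that $x_1 x_2 \,\alpha\, y_1 y_2$. Fix now $u$ and
$v$ arbitrarily in $A^*$; we have

$$\begin{aligned}
(t_1 t_2, \theta_{u,v}(x_1 x_2)) &= (t_1, \theta_{u, x_2 v}(x_1)) (t_2, \theta_{u x_1, v}(x_2)) \\
&= (t_1, \theta_{u, y_2 v}(x_1)) (t_2, \theta_{u y_1, v}(x_2)) \\
&\ \beta\ (s_1, \theta_{u, y_2 v}(y_1)) (s_2, \theta_{u y_1, v}(y_2)) \\
&= (s_1 s_2, \theta_{u,v}(y_1 y_2)).
\end{aligned}$$

□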


The  next  lemma  says  that  the  cancellation  properties  are  preserved  by  the 
block  product. 

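Lemma 5.2 If $A^*/\alpha$ is a group and $B^{(*)}/\beta$ is a loop, then
$A^{(*)}/\beta \,\square\, \alpha$ is a loop.

Proof. It is clear that the $\beta \,\square\, \alpha$-class containing the identity of
$A^{(*)}$ is an identity for the groupoid $A^{(*)}/\beta \,\square\, \alpha$. We next show
that $\beta \,\square\, \alpha$ is a left-cancellative congruence, i.e.
$(t, x)(s, y) \ \beta \,\square\, \alpha \ (t, x)(q, z)$ implies
$(s, y) \ \beta \,\square\, \alpha \ (q, z)$.

The hypothesis says that $(ts, xy) \ \beta \,\square\, \alpha \ (tq, xz)$; hence
$xy \,\alpha\, xz$ and because $\alpha$ is a group congruence, hence left-cancellative, we
deduce $y \,\alpha\, z$.

Consider some arbitrary $u$ and $v$ in $A^*$: we now need to show that
$(s, \theta_{u,v}(y)) \ \beta \ (q, \theta_{u,v}(z))$. Choose $w \in A^*$ such that
$wx \,\alpha\, u$ (such $w$ exists since $\alpha$ is a group congruence). The hypothesis
implies that $(ts, \theta_{w,v}(xy)) \ \beta \ (tq, \theta_{w,v}(xz))$, hence
$(t, \theta_{w,yv}(x))(s, \theta_{wx,v}(y)) \ \beta \ (t, \theta_{w,zv}(x))(q, \theta_{wx,v}(z))$.
Since $y$ and $z$ are $\alpha$-congruent, we have
$(t, \theta_{w,yv}(x)) = (t, \theta_{w,zv}(x))$. Since $\beta$ is a left-cancellative
congruence, we infer $(s, \theta_{wx,v}(y)) \ \beta \ (q, \theta_{wx,v}(z))$, i.e.
$(s, \theta_{u,v}(y)) \ \beta \ (q, \theta_{u,v}(z))$.

Hence $(s, y) \ \beta \,\square\, \alpha \ (q, z)$.

By symmetry, we get that $\beta \,\square\, \alpha$ is right cancellative as well, so that
$A^{(*)}/\beta \,\square\, \alpha$ is a loop. □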

We  are  now  ready  to  complete  the  proof  of  our  main  result. 

Theorem 5.3 If $L \subseteq A^*$ is a regular open language, then $L$ can be recognized
by a finite loop.

Proof. Suppose that $L$ is an open regular language; by Lemma 2.2, $L$ is a
finite union of languages of the form $L_0 a_1 L_1 \cdots a_k L_k$, where each $L_i$ is
recognized by a group. Using the classical construction for the associative case, it is
readily verified that the class of loop languages is closed under union, so it suffices to
prove that any language of the above form is recognized by a loop. If $k = 0$, the
claim is clearly true.

Let now $k \geq 1$; without loss of generality we can assume that all $L_i$'s are
recognized by the same group $G$, e.g. by taking the direct product of the syntactic
monoids of the languages $L_i$. Let $G = A^*/\alpha$, so that each $L_i$ is a union of
$\alpha$-classes; since concatenation distributes over union, and using once more closure
under union, it suffices to consider the case where each $L_i$ is a single class of the
congruence, i.e. $L_i = [u_i]_\alpha$ for some $u_i$ in $A^*$. Let
$B = A^*/\alpha \times A \times A^*/\alpha$ and let $H = B^{(*)}/\beta$ be the loop
recognizing the language $B^* b_1 B^* \cdots B^* b_k B^*$, as given by Lemma 3.4, where
$b_i = ([u_0 a_1 u_1 \cdots u_{i-1}]_\alpha, a_i, [u_i a_{i+1} u_{i+1} \cdots u_k]_\alpha)$.
We claim that $L$ is recognized by the loop $A^{(*)}/\beta \,\square\, \alpha$; in fact we
will show that $x \in L$ iff there is some tree $t$ with $|x|$ leaves such that
$[(t, \theta_{\varepsilon,\varepsilon}(x))]_\beta$ is an accepting element of $H$.

First let $x$ be in $L$; thus $x = x_0 a_1 x_1 \cdots a_k x_k$, with $x_i \,\alpha\, u_i$ for
each $i$. Thus $\theta_{\varepsilon,\varepsilon}(x)$ is in $B^* b_1 B^* \cdots B^* b_k B^*$,
hence for some tree $t$, $[(t, \theta_{\varepsilon,\varepsilon}(x))]_\beta$ is an accepting
element of $H$.

Conversely, suppose $x \in A^*$ is such that, for some tree $t$,
$[(t, \theta_{\varepsilon,\varepsilon}(x))]_\beta$ is an accepting element of $H$. Thus
$\theta_{\varepsilon,\varepsilon}(x) = v_0 b_1 v_1 \cdots b_k v_k$, where
$b_i = ([u_0 a_1 u_1 \cdots u_{i-1}]_\alpha, a_i, [u_i a_{i+1} u_{i+1} \cdots u_k]_\alpha)$.
Therefore $x = x_0 a_1 x_1 \cdots a_k x_k$, where
$x_0 a_1 x_1 \cdots x_{i-1} \,\alpha\, u_0 a_1 u_1 \cdots u_{i-1}$ for $i = 1, \ldots, k$, and
also $x_k \,\alpha\, u_k$. Using the fact that $\alpha$ is a group congruence, we deduce
$x_i \,\alpha\, u_i$ for each $i$, so that $x$ is in $L$. □


6  Conclusion 

Our  characterization  has  a  number  of  consequences,  from  the  point  of  view  of 
algebra,  language  theory  and  computational  complexity. 

First,  we  get  a  new  combinatorial  description  of  the  regular  open  languages. 

Corollary 6.1 Any regular open language is a finite union of languages of the
form $L_1 \cdots L_k$, where each $L_i$ is a group language.

Proof. By the proof of Theorem 4.2, every loop language is of this form,
hence by Theorem 5.3, this is also true for regular open languages. □

It is also appropriate to note the following structural representation that we
get for loops. By the proof of Theorem 4.2, we see that a loop $G$ recognizes
only regular open languages where the group languages that are needed are
recognizable by the multiplication group of $G$. By the proof of Theorem 5.3, any
language of the form $L_0 a_1 L_1 \cdots a_k L_k$, where each $L_i$ is recognized by the
group $\mathcal{M}(G)$, can be recognized by a loop of the form
$A^{(*)}/\beta \,\square\, \alpha$ where $A^*/\alpha \cong \mathcal{M}(G)$
and $\beta$ is the loop congruence induced by the construction of Lemma 3.4. Thus,
in some sense, computing over the loop $G$ is similar to computing over the group
$\mathcal{M}(G)$, the non-associativity being taken care of by the very simple loops given
in Lemma 3.4. It would be very interesting to see to what extent this phenomenon holds
for groupoids in general.

Another consequence of this work is that the computational complexity of
testing membership in a language recognized by a loop $G$ can be inferred from the
algebraic structure of its multiplication group $\mathcal{M}(G)$. Any language $L$
recognized by $G$ is a finite union of languages of the form $L_1 L_2 \cdots L_k$ where
the $L_i$'s are recognized by $\mathcal{M}(G)$. When $\mathcal{M}(G)$ is solvable, each of
these languages is in $ACC^0$ ([4]), where $ACC^0$ is the class of languages that are
recognized by a family of polynomial-size constant-depth Boolean circuits using NOT,
AND, OR, and modular gates. In such a case, it is easy to see that $L$ is also in
$ACC^0$. This shows the following corollary.

Corollary 6.2 Any language recognized by a loop whose multiplication group is
solvable belongs to $\mathrm{ACC}^0$.

Note that when G is a group, it can be shown that M(G) is solvable precisely
when G is solvable. Hence, the above result naturally fits in the structural
complexity framework of [4].
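
As an illustration of the multiplication group M(G) referred to above, the following Python sketch computes it for a finite loop given by its Cayley table, as the permutation group generated by all left and right translations. The function name and the two example tables are assumptions made for illustration only; they do not come from the paper.

```python
def multiplication_group(table):
    """Permutations generated by the left and right translations of a
    finite loop whose Cayley table is `table` (table[a][b] = a*b)."""
    n = len(table)
    lefts = {tuple(table[a][x] for x in range(n)) for a in range(n)}   # x -> a*x
    rights = {tuple(table[x][a] for x in range(n)) for a in range(n)}  # x -> x*a
    generators = lefts | rights
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:                       # close the set under composition
        p = frontier.pop()
        for g in generators:
            q = tuple(g[p[x]] for x in range(n))   # apply p first, then g
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

# Hypothetical examples: the cyclic group Z_3 and a non-associative loop of order 5.
Z3 = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]
LOOP5 = [[0, 1, 2, 3, 4],
         [1, 0, 3, 4, 2],
         [2, 4, 0, 1, 3],
         [3, 2, 4, 0, 1],
         [4, 3, 1, 2, 0]]

print(len(multiplication_group(Z3)))     # 3: for an abelian group, M(G) is G itself
print(len(multiplication_group(LOOP5)))  # strictly larger than 5
```

Because the set is finite, closing under composition with the generators already yields inverses, so the result is the whole generated group. Deciding solvability of the resulting group is a separate step that this sketch does not attempt.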

It is remarkable that non-associative algebras such as loops could be related
to such a natural class of languages as the regular open languages. Our
generalization of the block product yields a loop decomposition that shows that absence
of associativity does not necessarily imply absence of structure. This is also
confirmed by other recent works, such as [6]. We strongly believe that a better
understanding of non-associative algebras, in particular finite groupoids, could
have important consequences in language theory and computational complexity.
Abstract. We present a PCF-like calculus having real numbers as a
basic  data  type.  The  calculus  is  defined  by  its  denotational  semantics. 
We  prove  the  universality  of  the  calculus  (i.e.  every  computable  element 
is  definable).  We  address  the  general  problem  of  providing  an  operational 
semantics  to  calculi  for  the  real  numbers.  We  present  a  possible  solution 
based  on  a  new  representation  for  the  real  numbers. 

keywords:  real  number  computability,  domain  theory,  denotational  and 
operational  semantics,  abstract  data  types. 


1  Introduction 

The  aim  of  this  work  is  to  relate  two  different  approaches  to  computability  on 
real  numbers:  a  practical  approach  based  on  programming  languages,  and  a  more 
theoretical one based on domain theory. Several implementations of exact
computations on real numbers have been proposed so far ([BC90], [MM], [Vui88]).
In these works, real numbers are represented by programs generating sequences
of discrete elements, e.g. digits. On the other hand, different theoretical works
on computability on real numbers are based on domain theory: [Lac59, ML70],
[EE96], [DG96]. In all these works domains of approximations for the real
numbers are considered. A point in these domains represents either a real number or
the  approximation  of  a  real  number.  Approximated  reals  are  normally  described 
by  intervals  of  the  real  line. 

The relation between the two approaches is described in several steps.
First we present a domain of approximations which is directly derived from a
representation for the real numbers used in some implementations of exact
real number computation ([BC90, MM]). From this domain of approximations we
derive a calculus for the real numbers. The calculus we present is an extension
of PCF having the real numbers as a ground type. We call it $\mathcal{L}_r$. We define
$\mathcal{L}_r$ by giving its denotational semantics.

The next natural step consists in giving an operational semantics to the
calculus, possibly using the representation for the real numbers we started with. If
this were possible, we would have established a close connection between the
domain of approximations for the real numbers and the implementations. We would
have a calculus that is in many respects similar to the calculi used in the
implementations and whose terms can be directly interpreted in the approximations
domain. Unfortunately we prove that it is impossible to define the operational
semantics in this way. We prove this negative result in a general manner: the
impossibility holds not only for the particular representation of the real
numbers we chose, the domain of approximations obtained from it and the
calculus $\mathcal{L}_r$, but for a large class of representations, domains,
and calculi.

* Work partially supported by an EPSRC grant: “Techniques of Real Number
Computation” at Imperial College of Science, Technology and Medicine, London and by
EEC/HCM Network “Lambda Calcul Type”.

Finally we define an operational semantics for $\mathcal{L}_r$. In order to do this,
however, we need to introduce a new representation for the real numbers. This new
representation is quite different from the classical ones: in it, real numbers can
also be represented by sequences of digits that are undefined on some elements. In
order to compute with this new representation it is absolutely necessary to use
parallel operators. The use of parallel operators is the price we need to pay to
have a faithful calculus for the real numbers.

Acknowledgements: I would like to thank Abbas Edalat, Martin Escardo,
Peter  Potts  and  Michael  Smith  for  several  discussions  on  the  subject. 

2  Real  Number  Computation  in  PCF 

We  consider  the  following  representation  for  real  numbers: 

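Definition 1. A real number x is represented by a computable sequence of
integers $(s_0, \ldots, s_i, \ldots)$ such that:

(i) $\forall n .\; 2s_n - 1 \le s_{n+1} \le 2s_n + 1$

(ii) $x = \bigcap_{n \in \mathbb{N}} \left[ \frac{s_n - 1}{2^n}, \frac{s_n + 1}{2^n} \right]$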


In this representation a sequence of integers is used to describe a sequence of
rational intervals. The intervals in the sequence are nested, each one contained
in the previous one. For practical purposes this representation is quite
convenient. It allows us to reduce exact real number computation to computation
on integers. In this way it is possible to exploit the implementation of integer
arithmetic already available on computers. In [BCRO86] and [MM] a similar
representation has been used to develop quite efficient algorithms for the
arithmetic operations.
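
The integer-sequence representation can be made concrete with a small Python sketch. It assumes that the n-th integer $s_n$ stands for the interval $[(s_n - 1)/2^n, (s_n + 1)/2^n]$, and it picks $s_n$ as the nearest integer to $x \cdot 2^n$; that rounding rule and all function names are illustrative assumptions, not the algorithms of [BCRO86] or [MM].

```python
from fractions import Fraction
from math import floor

def approximating_sequence(x, length):
    """First `length` integers s_n, with s_n the nearest integer to x * 2**n."""
    return [floor(x * 2**n + Fraction(1, 2)) for n in range(length)]

def is_valid(seq):
    """Check 2*s_n - 1 <= s_{n+1} <= 2*s_n + 1 for all consecutive entries."""
    return all(2 * a - 1 <= b <= 2 * a + 1 for a, b in zip(seq, seq[1:]))

def interval(seq):
    """Rational interval [(s_n - 1)/2**n, (s_n + 1)/2**n] given by the last entry."""
    n = len(seq) - 1
    return Fraction(seq[-1] - 1, 2**n), Fraction(seq[-1] + 1, 2**n)

s = approximating_sequence(Fraction(1, 3), 6)
print(s)            # [0, 1, 1, 3, 5, 11]
print(is_valid(s))  # True
print(interval(s))  # (Fraction(5, 16), Fraction(3, 8))
```

Only integer arithmetic is needed to produce and check the sequence; rationals appear only when the integers are decoded back into intervals.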

We refer to [Plo77] for a definition of PCF. In order to represent real numbers
in PCF it is sufficient to translate into PCF the representation of Definition 1.
In the following, given a type $\sigma$, we consider the set of closed terms of
$\mathcal{L}_{PA+\exists}$ having type $\sigma$.

Definition 2. A partial representation function $\mathrm{Eval}_R$, from the closed terms
of type $\iota \to \iota$ to $\mathbb{R}$, is defined by: $\mathrm{Eval}_R(M_{\iota \to \iota}) = x$ if there exists
a sequence of integers s such that:

(i) $\forall n \in \mathbb{N} .\; \mathrm{Eval}(M_{\iota \to \iota}(n)) = s_n$,

(ii) $\forall n .\; 2s_n - 1 \le s_{n+1} \le 2s_n + 1$,

(iii) $x = \bigcap_{n \in \mathbb{N}} \left[ \frac{s_n - 1}{2^n}, \frac{s_n + 1}{2^n} \right]$.

A real number x is said to be $\mathcal{L}$-computable if it belongs to the image of $\mathrm{Eval}_R$.
We indicate with $\mathbb{R}_l$ the set of the $\mathcal{L}$-computable real numbers. The definition
of computability can be extended to functions on real numbers.

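Definition 3. The function $\mathrm{Eval}_R$, from the closed terms of type
$(\iota \to \iota) \to (\iota \to \iota)$ to $(\mathbb{R}_l \to \mathbb{R}_l)$, is defined by:

$\mathrm{Eval}_R(M) = f$ iff

$\forall x \in \mathbb{R}_l .\; \forall N .\; \mathrm{Eval}_R(N) = x \Rightarrow \mathrm{Eval}_R(MN) = f(x)$,

where N ranges over the closed terms of type $\iota \to \iota$.

A function $f : \mathbb{R}_l \to \mathbb{R}_l$ is said to be $\mathcal{L}$-computable if it belongs to the image of $\mathrm{Eval}_R$.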

It is worthwhile to observe that the sequential operators are sufficient to define
every computable function. That is, every $\mathcal{L}$-computable function on reals can be
defined by a term not containing the parallel test or the existential quantifier.
The form of computation presented in this section is very similar to the one used
in implementations of exact real number computation and described in [BC90]
and in [MM].

3  A  Domain  of  Approximations  for  Real  Numbers 

In the literature there are several approaches to computability on real numbers
which make use of domain theory. Early works in this area are [Lac59], [ML70],
and [Sco70]. In all these approaches the real line is embedded in a space of
approximations where a notion of computability can be defined in a natural way.
Many results concerning the computability theory on real numbers are given in
these contexts. Here we are going to present a space of approximations that is
similar in many respects to the ones mentioned above but has two important
differences. First, we base our construction on the representation of Definition 1.
As a result our space has fewer approximation points and is more closely related to
the computation described in [BC90] and [MM]. A second important difference
is the following: our space of approximations turns out to be a Scott-domain.
The other approaches use spaces of approximations that are continuous but not
algebraic cpos. The space of approximations presented here has been extensively
studied in [DG96]. Here we summarise the main results without giving the proofs.

The domain of approximations defined next is called Reals Domain (RD). We
present a construction of RD starting with the integer sequence representation
for real numbers. Let $(s_i)_{i \in \mathbb{N}}$ be a sequence of integers defining a real number x
according to Definition 1 and let $(s_i)_{i \le n}$ be an initial subsequence. $(s_i)_{i \le n}$ gives
partial information about the value x. Examining $(s_i)_{i \le n}$ we can deduce that
the value x is contained in an interval of real numbers.

Definition 4. Let S be the subset of sequences of integers defined by:

$S = \{ (s_i)_{i \le n} \mid \forall i \le n - 1 .\; 2s_i - 1 \le s_{i+1} \le 2s_i + 1 \}$.

The function $\phi$ from S to the set of rational intervals is defined by:

$\phi((s_0, s_1, \ldots, s_n)) = \left[ \frac{s_n - 1}{2^n}, \frac{s_n + 1}{2^n} \right]$

The set S contains the “valid” sequences of integers. The function $\phi$ associates to
any finite sequence $(s_i)_{i \le n}$ the interval [a, b] containing the real numbers that can
be represented by sequences having $(s_i)_{i \le n}$ as initial subsequence. The interval
[a, b] represents the information contained in the sequence $(s_i)_{i \le n}$.

Let $(DI, \sqsubseteq)$ denote the partial order formed by the set of rational intervals in
the image of the function $\phi$. The order relation $\sqsubseteq$ on DI is the superset relation,
that is $[a, b] \sqsubseteq [a', b']$ if $[a', b'] \subseteq [a, b]$ (that is, if $[a', b']$ is a more precise
approximation of a real number than $[a, b]$). The set DI forms the base of the domain RD.
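
A brief Python sketch of the information order on intervals may help here. It assumes that a finite valid sequence $(s_0, \ldots, s_n)$ is mapped to the interval $[(s_n - 1)/2^n, (s_n + 1)/2^n]$; the helper names `phi` and `below` are hypothetical.

```python
from fractions import Fraction

def phi(seq):
    """Interval described by a finite valid sequence (s_0, ..., s_n)."""
    n = len(seq) - 1
    return Fraction(seq[-1] - 1, 2**n), Fraction(seq[-1] + 1, 2**n)

def below(i, j):
    """Information order: i is below j when j is contained in i."""
    return i[0] <= j[0] and j[1] <= i[1]

seq = [0, 1, 1, 3, 5, 11]                      # a valid sequence approximating 1/3
chain = [phi(seq[:k]) for k in range(1, len(seq) + 1)]

# Extending a valid sequence can only add information.
print(all(below(a, b) for a, b in zip(chain, chain[1:])))   # True

# Two intervals need not be comparable.
u = (Fraction(0), Fraction(1, 2))
w = (Fraction(1, 4), Fraction(3, 4))
print(below(u, w), below(w, u))                              # False False
```

The order is therefore partial: overlapping intervals such as `u` and `w` carry different, compatible pieces of information without either refining the other.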

Definition 5. Let RD be the cpo obtained by the ideal completion of $(DI, \sqsubseteq)$.

Proposition 6. RD is a consistently complete $\omega$-algebraic cpo (Scott-domain).
RD is an effective Scott-domain when we consider the following enumeration of
finite elements:

$e^r(0) = \bot$, $e^r(\langle \langle n_1, n_2 \rangle, n_3 \rangle + 1) = \left[ \frac{n_1 - n_2 - 1}{2^{n_3}}, \frac{n_1 - n_2 + 1}{2^{n_3}} \right]$

where $\langle\ ,\ \rangle$ is an effective coding function for pairs of natural numbers.

The elements of RD can be thought of as equivalence classes of (partial) sequences
of integers. Each equivalence class is composed of sequences containing identical
information about the real value they approximate. The relationship existing
between the real line and the infinite elements of RD can be clarified by means
of the following functions:

Definition 7. A function $q_R : RD \to \mathcal{P}(\mathbb{R})$ is defined by:

$q_R(d) = \bigcap_{[a,b] \in d} [a, b]$

Conversely, three functions $e, e^-, e^+ : \mathbb{R} \to RD$ are defined by:

$e(x) = \{ [a, b] \in DI \mid x \in (a, b) \}$

$e^-(x) = \{ [a, b] \in DI \mid x \in (a, b] \}$

$e^+(x) = \{ [a, b] \in DI \mid x \in [a, b) \}$

where (a, b) indicates the open interval from a to b and (a, b] and [a, b) indicate
the obvious part open, part closed intervals.

Proposition 8. The following statements hold:

i) for every infinite element $d \in RD$ there exists a real number x such that
$q_R(d) = \{x\}$,

ii) for every real number x, $\{x\} = q_R \circ e(x) = q_R \circ e^-(x) = q_R \circ e^+(x)$,

iii) for every non-dyadic number $x \in \mathbb{R} \setminus D$, $e(x) = e^-(x) = e^+(x)$,

iv) for every dyadic number $x \in D$, $e(x) \sqsubseteq e^-(x)$, $e(x) \sqsubseteq e^+(x)$ and $e^-(x)$ is
not consistent with $e^+(x)$,

v) $e(\mathbb{R}) \cup e^-(\mathbb{R}) \cup e^+(\mathbb{R})$ is equal to the set of infinite elements of RD.

We can say that the infinite elements of RD are a close representation of the
real line: the set of infinite elements in RD looks like the real line except that
each dyadic number is triplicated.

In  [DG96]  it  is  shown  how  to  solve  the  problem  of  multiple  representations 
by  means  of  a  retract  construction. 
Fig. 1. The diagram representing RD.


4  PCF  Extended  with  Real  Numbers 

In this section we use the domain RD introduced above to define an extension
of the language PCF having a ground type for the real numbers. We call this
extension $\mathcal{L}_r$. We will prove that any computable function on RD is definable
by a suitable expression in $\mathcal{L}_r$. A programming language similar to $\mathcal{L}_r$ has been
introduced in [DG93]. An extension of PCF based on a different domain of
approximation for the real numbers has been presented in [Esc96].

Compared with the real computation described in Section 2, the real
computation in $\mathcal{L}_r$ has several advantages. Given a closed term M of
$\mathcal{L}_{PA+\exists}$, the value $\mathrm{Eval}_R(M)$ can be undefined for several reasons. For example:

(i) there can be a term N representing a real number such that the sequence
((MN)0), ..., ((MN)n), ... does not define a real number.

(ii) there can be two terms $N_1$ and $N_2$ defining the same real number and such
that $(MN_1)$ and $(MN_2)$ define different real numbers.

The language $\mathcal{L}_r$ is free from these inadequacies. Terms of type r in $\mathcal{L}_r$ can
always be interpreted as an (approximated) real and, more importantly, terms
of type $r \to r$ preserve the equivalence between different representations of the
same real number. We can say that $\mathcal{L}_r$ defines an abstract data type for real
numbers. It defines a collection of primitive functions on reals which generate
any other computable function.

The types of $\mathcal{L}_r$ are the PCF types extended with a new ground type r. The set
T of type expressions is defined by the grammar:

$\sigma ::= \iota \mid o \mid r \mid \sigma \to \sigma$

The terms of $\mathcal{L}_r$ are the terms of $\mathcal{L}_{PA+\exists}$ extended with the new constants:

$(-1), (+1), (\times 2), (\div 2), PR : r \to r$,

$(< 0) : r \to o$, $\mathrm{pif}_r : o \to r \to r \to r$.

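We define $\mathcal{L}_r$ by giving its denotational semantics. To this end we use the set of
Scott-domains $UD = \{ D_\sigma \mid \sigma \in T \}$, where $D_\iota = \mathbb{Z}_\bot$, $D_o = \mathbb{B}_\bot$, $D_r = RD$
and $D_{\sigma \to \tau} = [D_\sigma \to D_\tau]$.

The denotation of the new constants is the following.
The constants (+1), (−1), (×2), (÷2) realize the corresponding functions on reals.

$[\![(+1)]\!]_\rho(d) = \{ [a + 1, b + 1] \mid [a, b] \in d \}$

$[\![(-1)]\!]_\rho(d) = \{ [a - 1, b - 1] \mid [a, b] \in d \}$

$[\![(\times 2)]\!]_\rho(d) = \{ [a \times 2, b \times 2] \mid [a, b] \in d \wedge [a \times 2, b \times 2] \in DI \}$

$[\![(\div 2)]\!]_\rho(d) = \bigcup_{[a,b] \in d} \downarrow [a \div 2, b \div 2]$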

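The constant (< 0) tests if a number is smaller or larger than 0.

$[\![(< 0)]\!]_\rho(d) = \begin{cases} tt & \text{if there exists } [a, b] \in d \text{ with } b < 0 \\ ff & \text{if there exists } [a, b] \in d \text{ with } 0 < a \\ \bot & \text{otherwise} \end{cases}$

The constant PR defines a kind of projection on the interval [−1, 1].

$[\![PR]\!]_\rho(d) = \begin{cases} d \sqcup \downarrow[-1, 1] & \text{if } d \text{ is consistent with } \downarrow[-1, 1] \\ e^+(-1) & \text{if } \exists [a, b] \in d .\; b \le -1 \\ e^-(1) & \text{if } \exists [a, b] \in d .\; a \ge 1 \end{cases}$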

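The constant $\mathrm{pif}_r$ defines a parallel test.

$[\![\mathrm{pif}_r]\!]_\rho(e)(d)(d') = \begin{cases} d & \text{if } e = tt \\ d' & \text{if } e = ff \\ d \sqcap d' & \text{if } e = \bot \end{cases}$

If the boolean argument is undefined, the function $[\![\mathrm{pif}_r]\!]_\rho$ gives as output the
most precise common approximation of the second and third arguments.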
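
The behaviour of the test (< 0) and of the parallel conditional can be illustrated with a simplified Python model. The sketch assumes that an approximation is a single closed interval `(lo, hi)` rather than an ideal of dyadic intervals, that the undefined truth value is modelled by `None`, and that the meet of two intervals is their convex hull; the helpers `neg` and `absolute` are hypothetical and are not constants of the calculus.

```python
from fractions import Fraction

def lt0(d):
    """True/False when the interval decides the sign, None (undefined) otherwise."""
    lo, hi = d
    if hi < 0:
        return True
    if lo > 0:
        return False
    return None

def meet(d1, d2):
    """Most precise interval approximating both arguments (their convex hull)."""
    return min(d1[0], d2[0]), max(d1[1], d2[1])

def pif(e, d1, d2):
    """Parallel conditional: still produces information when the test is undefined."""
    if e is True:
        return d1
    if e is False:
        return d2
    return meet(d1, d2)

def neg(d):
    """Hypothetical helper: interval negation."""
    return -d[1], -d[0]

def absolute(d):
    return pif(lt0(d), neg(d), d)

print(absolute((Fraction(-3), Fraction(-1))))        # (Fraction(1, 1), Fraction(3, 1))
print(absolute((Fraction(-1, 8), Fraction(1, 8))))   # (Fraction(-1, 8), Fraction(1, 8))
```

On an interval straddling 0 the test is undefined, so a sequential conditional would return no information at all, whereas the parallel one still returns an interval whose width shrinks together with the input.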

It is not difficult to prove that for every closed expression $M_\sigma$ and
environment $\rho$, $[\![M_\sigma]\!]_\rho$ is a computable element of $D_\sigma$. Next we prove the
universality of $\mathcal{L}_r$, that is, we prove that every computable function on RD is
definable by a suitable term in $\mathcal{L}_r$. In order to do this we present a generalisation of the
universality theorem for PCF [Plo77, Theorem 5.1]. The generalisation applies
to any extension of PCF where ground types are denoted by coherent domains.
The proof in [Plo77] works only for flat domains. An equivalent generalisation
has already been given in [Str94]. In that work the proof is based on categorical
arguments and uses as a lemma the original result in [Plo77]. Our proof follows
the line of the original proof and is more direct. Some definitions and lemmata
are necessary here.
Definition 9. A subset A of a partial order P is coherent if any pair of elements
of A has an upper bound. A coherent domain is a Scott-domain for which any coherent
subset has an upper bound.

Coherent domains are closed under many semantic functors. In particular, if $D_1$
and $D_2$ are coherent domains then $[D_1 \to D_2]$ is a coherent domain. Moreover,
the domain RD is coherent.
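
For intervals, coherence reflects a familiar one-dimensional fact: if the members of a finite family overlap pairwise, then the whole family has a common refinement. The Python sketch below checks this on a small example; it is only an illustration under simplifying assumptions (plain numeric intervals, no restriction to dyadic endpoints, finite families), and the function names are hypothetical.

```python
from itertools import combinations

def consistent(i, j):
    """Two intervals admit a common refinement when they overlap in more than a point."""
    return max(i[0], j[0]) < min(i[1], j[1])

def pairwise_consistent(family):
    return all(consistent(i, j) for i, j in combinations(family, 2))

def join(family):
    """Intersection of all the intervals, or None when it has empty interior."""
    lo = max(i[0] for i in family)
    hi = min(i[1] for i in family)
    return (lo, hi) if lo < hi else None

family = [(-1, 1), (0, 2), (0.5, 1.5)]
print(pairwise_consistent(family))   # True
print(join(family))                  # (0.5, 1)

print(pairwise_consistent([(0, 1), (2, 3)]), join([(0, 1), (2, 3)]))   # False None
```

The largest lower endpoint and the smallest upper endpoint each belong to some member of the family, and pairwise consistency of those two members already forces the intersection to be non-degenerate.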

A fundamental step in the proof of universality consists in showing that for
every type $\sigma$ it is possible to define three functions, namely $c_\sigma$, $p_\sigma$ and $\#_\sigma$,
where $c_\sigma$ and $p_\sigma$ are respectively a test and a projection function for the type
$\sigma$, while $\#_\sigma(n)(d)$ checks if the element d is inconsistent with the finite element
$e^\sigma(n)$ (where $e^\sigma$ is the effective enumeration of the finite elements of the domain
$D_\sigma$ ([Plo77, page 249])). Formally:

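Definition 10. A partial function $f : D_{\sigma_1} \to \cdots \to D_{\sigma_n} \rightharpoonup D_\sigma$ is definable in
$\mathcal{L}_r$ if there exists a closed term M such that for all $d_1 \in D_{\sigma_1}, \ldots, d_n \in D_{\sigma_n}$, if
$f(d_1) \ldots (d_n)$ is defined then $[\![M]\!]_\rho(d_1) \ldots (d_n) = f(d_1) \ldots (d_n)$.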


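Definition 11. Given a coherent domain $D_\sigma$, the function
$c_\sigma : \mathbb{B}_\bot \to D_\sigma \to D_\sigma \to D_\sigma$ and the partial functions
$\#_\sigma : \mathbb{Z}_\bot \to D_\sigma \rightharpoonup \mathbb{B}_\bot$ and $p_\sigma : \mathbb{Z}_\bot \to D_\sigma \rightharpoonup D_\sigma$ are defined by:

$c_\sigma(b)(d_1)(d_2) = \begin{cases} d_1 & \text{if } b = tt \\ d_2 & \text{if } b = ff \\ d_1 \sqcap d_2 & \text{if } b = \bot \end{cases}$

$\#_\sigma(n)(d) = \begin{cases} ff & \text{if } n \in \mathbb{N},\ e^\sigma(n) \sqsubseteq d \\ tt & \text{if } n \in \mathbb{N},\ e^\sigma(n) \text{ and } d \text{ are inconsistent} \\ \text{undefined} & \text{if } n \text{ is a negative number} \\ \bot & \text{otherwise} \end{cases}$

$p_\sigma(n)(d) = \begin{cases} d \sqcup e^\sigma(n) & \text{if } n \in \mathbb{N},\ d \text{ and } e^\sigma(n) \text{ are consistent} \\ \text{undefined} & \text{otherwise} \end{cases}$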

Lemma 12. If, in a language extending $\mathcal{L}_{PA+\exists}$ with new ground types, for every
ground type $\tau$ the functions $c_\tau$, $p_\tau$, $\#_\tau$ are definable by some terms, then
for any other type $\sigma$ the functions $c_\sigma$, $p_\sigma$, $\#_\sigma$ are definable by some suitable terms.

Lemma 13. If, in an extension of the language $\mathcal{L}$, for a type $\sigma$ the function $p_\sigma$
is definable, then every computable element in $D_\sigma$ is definable.


Theorem 14. For every computable element d in $D_\sigma$ there exists a closed
expression M in $\mathcal{L}_r$ such that $[\![M]\!]_\rho = d$.
5 Operational Semantics, a First Attempt

In this section we discuss the problem of defining an operational semantics for
$\mathcal{L}_r$. In Section 3 the elements of RD are constructed as equivalence classes of
partial sequences of integers. One can use functions in $[\mathbb{Z}_\bot \to \mathbb{Z}_\bot]$ to represent
sequences of integers and hence elements in RD. Following this approach one
can use higher order functions on $[\mathbb{Z}_\bot \to \mathbb{Z}_\bot]$ to represent functions on RD. The
construction is the following. Let S' be the subset of $[\mathbb{Z}_\bot \to \mathbb{Z}_\bot]$ defined by

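$S' = \{ s \mid \forall i \in \mathbb{N} .\; ( s(i+1) \neq \bot \Rightarrow ( s(i) \neq \bot \wedge 2s(i) - 1 \le s(i+1) \le 2s(i) + 1 ) ) \}$

The elements of S' define the partial sequences of digits representing elements in
RD. Let $\phi' : S' \to RD$ be the function

$\phi'(s) = \bigsqcup \left\{ \left[ \frac{s(i) - 1}{2^i}, \frac{s(i) + 1}{2^i} \right] \;\middle|\; i \in \mathbb{N},\ s(i) \neq \bot \right\}$.

Given a function g on RD, for example $g : RD \to RD \to RD$, we say that g is
represented by a function $f : [\mathbb{Z}_\bot \to \mathbb{Z}_\bot] \to [\mathbb{Z}_\bot \to \mathbb{Z}_\bot] \to [\mathbb{Z}_\bot \to \mathbb{Z}_\bot]$ if for all
$s_1, s_2 \in S'$, $g(\phi'(s_1))(\phi'(s_2)) = \phi'(f(s_1)(s_2))$.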

The above representation for functions on RD suggests the following
approach to operational semantics: for each new constant c in $\mathcal{L}_r$ one tries to find a
computable function $f_c$ on $[\mathbb{Z}_\bot \to \mathbb{Z}_\bot]$ representing the function $[\![c]\!]$. If the
functions $f_c$ existed, then a set of closed $\mathcal{L}_{PA+\exists}$-terms $M_c$ such that $[\![M_c]\!]_\rho = f_c$
would define an operational semantics for $\mathcal{L}_r$. The operational semantics would
be given by the reduction rules $c \to M_c$. In fact the operational behaviour of $M_c$
would be in accordance with the denotational semantics of c. Unfortunately this natural
approach is doomed to failure. In fact the function $[\![\mathrm{pif}_r]\!]_\rho$ cannot be represented
by any functional on integers. We state this negative result in a more general
setting, considering not only the real number representation of Definition 1 and
the corresponding domain RD but a large class of real number representations
and domains of approximations.

In  almost  all  the  representations  considered  in  the  literature  a  real  number 
is  represented  by  a  sequence  of  elements  of  a  countable  set  C.  For  example  C 
can  be  a  set  of  digits,  the  set  of  integers,  the  set  of  p-adic  rational  numbers,  the 
set  of  rational  numbers,  the  set  of  rational  intervals. 

Definition 15. A sequence representation for the real numbers is given by a
countable set C, a subset S of $\mathbb{N} \to C$ and a representation function $v : S \to \mathbb{R}$.
The set S is the subset of sequences defining real numbers.

Repeating  the  construction  of  Section  3  we  map  finite  sequences  to  subsets  of 
reals. 

Definition 16. Given a sequence representation $v : S \to \mathbb{R}$, its extension to
partial sequences $\bar{v} : [\mathbb{N} \to C_\bot] \to \mathcal{P}(\mathbb{R})$ is defined by

$\bar{v}(s) = \{ v(t) \mid t \in S,\ s \sqsubseteq t \}$.

Given a sequence s and a natural number n we indicate with $s|_n$ the partial
sequence containing the first n elements of s: $s|_n(m) = s(m)$ if $m < n$,
$s|_n(m) = \bot$ otherwise. In [Wei87, pages 479-482] the notion of admissible
representation for real numbers has been introduced. That definition can be
reformulated as follows.

Definition 17. A sequence representation (S, v) is admissible if it satisfies the
following conditions:

(i) $\forall s \in S .\; \forall \varepsilon > 0 .\; \exists n \in \mathbb{N} .\; \bar{v}(s|_n)$ is contained in an interval having width $\varepsilon$,

(ii) for each real number x there exists a sequence s such that for each n, x is
contained in the interior of $\bar{v}(s|_n)$.

Condition (i) states that the function $v : S \to \mathbb{R}$ is continuous w.r.t. the Cantor
topology on S and the Euclidean topology on $\mathbb{R}$. Almost all the representation
functions used in computable analysis are admissible.

Any sequence representation induces an information order on partial
sequences: s is below t in the information order if $\bar{v}(s) \supseteq \bar{v}(t)$. We have the
following negative result.

Theorem 18. For any admissible representation v, there is no continuous
functional $g : [\mathbb{N} \to C_\bot] \to [\mathbb{N} \to C_\bot] \to [\mathbb{N} \to C_\bot]$ such that:

(i) g implements addition, that is: for all s, t in S, $v(g(s)(t)) = v(s) + v(t)$,

(ii) g respects the induced order relation on partial functions, that is: for all
s, s', t, t' in $[\mathbb{N} \to C_\bot]$, $\bar{v}(s) \supseteq \bar{v}(s')$ and $\bar{v}(t) \supseteq \bar{v}(t')$ implies
$\bar{v}(g(s)(t)) \supseteq \bar{v}(g(s')(t'))$.

The previous theorem implies that, if we use an admissible representation, then the
operational semantics of $\mathcal{L}_r$ cannot be given in terms of computations on sequences. This
result generalises to any domain derived from an admissible representation and
to any calculus defined on the derived domain. There are two possible solutions
to this problem. The first one consists in introducing non-deterministic or
intensional operators in the language. The second one consists in using representations
that are not admissible, but that are suitable for real number computations. The
first approach has been followed in [Esc96]; there the operational semantics of a
language similar to $\mathcal{L}_r$ is given using a non-deterministic operator. Here we will
follow the second approach.

6  An  Operational  Semantics 

The notations considered so far in the literature represent real numbers using
sequences that are completely defined. It is possible to represent real numbers
using sequences that are undefined on some elements. An example is the
following.

Definition 19. A real number x in the interval [−1, 1] is represented by a
sequence s of digits −1, 1 such that: $x = \sum_{i \in \mathbb{N}} \prod_{0 \le j \le i} \frac{s(j)}{2}$

This notation is similar to the binary digit notation. The main differences
consist in the use of the digit −1 instead of the digit 0 and in the fact that in
this notation the value of a digit affects the weights of all the subsequent
digits. In this notation the real number 0 has two representations: the sequence
(−1, −1, 1, 1, 1, ...) and the sequence (1, −1, 1, 1, 1, ...). The two representations
differ only in the first digit. Hence 0 can also be represented by the sequence
($\bot$, −1, 1, 1, 1, ...) undefined on the first element. Moreover, examining the finite
initial parts of the incomplete sequence it is possible to determine the number
represented by it with arbitrary precision. Similar considerations hold for
any other dyadic rational number. Every real number that is not a dyadic rational
has exactly one representation. If we also allow as representations for the
dyadic rational numbers the sequences undefined on one element, we obtain
a representation suitable for real number computation.
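
The claim that an incomplete sequence still determines a number with arbitrary precision can be checked with a short Python sketch. It assumes the value of a digit sequence is $\sum_i \prod_{0 \le j \le i} s(j)/2$, which satisfies the recursion value(s) = (s(0)/2)(1 + value(tail of s)); an undefined digit is modelled by `None` and is handled by taking the hull of the two possible cases. The function name `enclosure` is hypothetical.

```python
from fractions import Fraction

def enclosure(prefix):
    """Interval (lo, hi) containing every x whose digit sequence extends `prefix`.
    Digits are 1, -1, or None for an undefined position."""
    lo, hi = Fraction(-1), Fraction(1)        # nothing is known about the tail
    for digit in reversed(prefix):
        a, b = 1 + lo, 1 + hi                 # 0 <= a <= b <= 2
        if digit == 1:
            lo, hi = a / 2, b / 2
        elif digit == -1:
            lo, hi = -b / 2, -a / 2
        else:                                 # undefined digit: hull of both cases
            lo, hi = -b / 2, b / 2
    return lo, hi

print(enclosure([-1, -1, 1, 1, 1]))    # (Fraction(-1, 16), Fraction(0, 1))
print(enclosure([1, -1, 1, 1, 1]))     # (Fraction(0, 1), Fraction(1, 16))
print(enclosure([None, -1, 1, 1, 1]))  # (Fraction(-1, 16), Fraction(1, 16))
```

Each additional digit 1 appended to the prefix halves the width of the last enclosure, so the sequence with an undefined first digit pins down 0 as precisely as desired.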

In  order  to  represent  the  whole  real  line  we  consider  the  following  notation. 

Definition 20. A representation function $v : (\mathbb{N} \to \{-1, 1\}) \to \mathbb{R}$ is defined by:

$v(s) = s(0) \times \left( k + \sum_{i > k} \prod_{k < j \le i} \frac{s(j)}{2} \right)$

where $k = \min \{ i \mid i > 0,\ s(i) = -1 \}$.

This is a sort of “sign, integer part, mantissa” notation for the real numbers.
The first digit gives the sign, the next consecutive positive digits determine the
integer part, the remaining part of the sequence is the mantissa. Also in this case
every dyadic rational number is represented by two functions that differ only in
one element and every real number that is not a dyadic rational has exactly one
representation.
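
A finite, fully defined prefix in the “sign, integer part, mantissa” notation can be decoded into an enclosing interval as sketched below in Python. The sketch assumes the reading $v(s) = s(0) \times (k + m)$, where k is the position of the first −1 after the sign digit and m is the value of the remaining digits, computed as $\sum_i \prod_{j \le i} d_j / 2$; the function names are hypothetical, and the prefix is assumed to contain a −1 after position 0.

```python
from fractions import Fraction

def mantissa(digits):
    """Enclosure of the mantissa value over all extensions of `digits`."""
    lo, hi = Fraction(-1), Fraction(1)
    for d in reversed(digits):
        a, b = 1 + lo, 1 + hi
        lo, hi = (a / 2, b / 2) if d == 1 else (-b / 2, -a / 2)
    return lo, hi

def decode(prefix):
    """Enclosure of v(s) for every total sequence s extending `prefix`.
    Assumes prefix[0] is the sign and some later entry equals -1."""
    sign = prefix[0]
    k = next(i for i in range(1, len(prefix)) if prefix[i] == -1)
    lo, hi = mantissa(prefix[k + 1:])
    ends = (sign * (k + lo), sign * (k + hi))
    return min(ends), max(ends)

print(decode([1, 1, 1, -1, 1, 1]))    # (Fraction(7, 2), Fraction(4, 1))
print(decode([-1, -1, -1, 1, 1]))     # (Fraction(-1, 4), Fraction(0, 1))
```

Under this reading a sequence with terminator position k denotes a number of absolute value between k − 1 and k + 1, so consecutive values of k overlap and no real number falls between two integer parts.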

Definition 21. The extension of v to partial functions is the function
$\bar{v} : (\mathbb{N} \to \{-1, 1\}_\bot) \to \mathcal{P}(\mathbb{R})$ defined, as in Definition 16, by $\bar{v}(s) = \{ v(t) \mid s \sqsubseteq t \}$.

The set $\bar{v}(s)$ is an interval if and only if

$\forall n .\; ( s(n)\!\uparrow \wedge\; s(n+1)\!\downarrow ) \Rightarrow ( \forall m < n .\; s(m)\!\downarrow ) \wedge s(n+1) = -1 \wedge \forall m > n + 1 .\; ( s(m)\!\uparrow \vee\; s(m) = 1 )$.

Let $S^\infty$ denote the set of partial functions s such that $\bar{v}(s)$ is an interval. $S^\infty$
is a complete partial order. If we repeat the construction of Section 3 with the
representation v and the set of partial elements $S^\infty$, we obtain a new domain
for real numbers. We call the new domain RD'. In this case no pair of elements
in $S^\infty$ contains the same information. It follows that $S^\infty$ and RD' are isomorphic.
The structures of RD and RD' are quite similar. The main difference consists
in the fact that RD' contains for each natural number n the intervals $[-\infty, -n]$
and $[n, +\infty]$ and, as a consequence, the infinite points $-\infty$ and $+\infty$.

Proposition 22. There exists an effective embedding-projection pair (e, p) from
$S^\infty$ to $[\mathbb{N} \to \{-1, 1\}_\bot]$. $p : [\mathbb{N} \to \{-1, 1\}_\bot] \to S^\infty$ is defined by:

$p(s) = \bigsqcup \{ s' \in S^\infty \mid s' \sqsubseteq s \}$

and $e : S^\infty \to [\mathbb{N} \to \{-1, 1\}_\bot]$ is the identity function.

It follows that there exists an effective embedding-projection pair $(e_r, q_r)$ from
RD' to $[\mathbb{Z}_\bot \to \{tt, ff\}_\bot]$. The embedding-projection pair can be extended to the
function spaces:

$e_{\sigma \to \tau}(f) = e_\tau \circ f \circ q_\sigma$

$q_{\sigma \to \tau}(f) = q_\tau \circ f \circ e_\sigma$

Repeating the considerations presented in Section 5, it is possible to represent
elements in RD' by their embeddings in $[\mathbb{Z}_\bot \to \{tt, ff\}_\bot]$ and functions on RD'
($S^\infty$) by the corresponding embeddings in the function spaces over $[\mathbb{Z}_\bot \to \{tt, ff\}_\bot]$.
Let C be the set of the new constants in $\mathcal{L}_r$; for each $c^\sigma \in C$ let $M_{c^\sigma}$ be a term in
$\mathcal{L}_{PA+\exists}$ defining the function $e_\sigma([\![c^\sigma]\!]_\rho)$. By the universality of $\mathcal{L}_{PA+\exists}$ the terms
$M_{c^\sigma}$ exist. An operational semantics for $\mathcal{L}_r$ can be given by adding to the set of
single-step reduction rules for $\mathcal{L}_{PA+\exists}$ the new set of rules $\{ c \to M_c \mid c \in C \}$.
For lack of space we do not present the actual set of rules.
Abstract. In the 1980s, Bennett introduced computational depth as a
formal  measure  of  the  amount  of  computational  history  that  is  evident 
in  an  object’s  structure.  In  particular,  Bennett  identified  the  classes  of 
weakly  deep  and  strongly  deep  sequences,  and  showed  that  the  halting 
problem is strongly deep. Juedes, Lathrop, and Lutz subsequently extended
this result by defining the class of weakly useful sequences, and
proving  that  every  weakly  useful  sequence  is  strongly  deep. 

The  present  paper  investigates  refinements  of  Bennett’s  notions  of  weak 
and  strong  depth,  called  recursively  weak  depth  (introduced  by  Fenner, 
Lutz  and  Mayordomo)  and  recursively  strong  depth  (introduced  here).  It 
is  argued  that  these  refinements  naturally  capture  Bennett’s  idea  that 
deep  objects  are  those  which  “contain  internal  evidence  of  a  nontrivial 
causal  history.”  The  fundamental  properties  of  recursive  computational 
depth are developed, and it is shown that the recursively weakly (respectively,
strongly) deep sequences form a proper subclass of the class
of  weakly  (respectively,  strongly)  deep  sequences.  The  above-mentioned 
theorem  of  Juedes,  Lathrop,  and  Lutz  is  then  strengthened  by  proving 
that  every  weakly  useful  sequence  is  recursively  strongly  deep.  It  follows 
from  these  results  that  not  every  strongly  deep  sequence  is  weakly  useful, 
thereby  answering  a  question  posed  by  Juedes. 


1  Introduction 

Computational  depth  was  introduced  by  Bennett  [2,3]  as  a  formal  measure  of 
the  amount  of  computational  history  that  is  evident  in  the  structure  of  a  com¬ 
putational,  physical,  or  biological  object.  Roughly  speaking,  if  x  is  an  object 
(such  as  a  computer  program,  a  point  in  a  phase  space,  or  a  DNA  sequence) 
that  can  be  encoded  in  binary  in  a  natural  way  —  in  which  case  we  identify  x 
with  its  encoding  —  then  the  computational  depth  of  x  is  the  amount  of  time 
required for a computation to derive x from its shortest binary description. Like
Solomonoff [13], Bennett regards a description of x as a formal analog of a scientific
explanation of x. By Occam’s razor, then, the shortest description of x
is the most plausible explanation of x, and the computational depth of x is the
amount of time required for an effective process to generate x from its most
plausible explanation. Bennett thus says that a deep object is “one whose most
plausible origin, via an effective process, entails a lengthy computation,” and,
more succinctly, that a deep object is one that contains “internal evidence of a
nontrivial causal history” [3].

* This research was supported in part by National Science Foundation Grant CCR-9157382,
with matching funds from Rockwell, Microware Systems Corporation, and
Amoco Foundation.

In  order  to  avoid  undue  sensitivity  to  the  underlying  computational  model, 
Bennett’s  definition  of  depth  refers  not  only  to  an  object’s  shortest  description, 
but  to  all  descriptions  of  the  object  that  have  nearly  minimal  length.  This  is 
achieved  by  adding  a  significance  parameter  to  the  definition.  Specifically,  for 
$c \in \mathbb{N}$, the computational depth of an object x at significance level c is the time
required for a computation to derive x from a binary description $\pi$ that is itself
compressible by no more than c bits. (That is, every description of $\pi$ consists of
at least $|\pi| - c$ bits.)

For  (infinite,  binary)  sequences,  Bennett  [2,3]  introduced  two  interesting 
depth  conditions,  strong  depth  and  weak  depth.  A  sequence  S  is  strongly  deep  if, 
for every computable time bound $t : \mathbb{N} \to \mathbb{N}$ and every constant $c \in \mathbb{N}$, for all but
finitely many $n \in \mathbb{N}$, the n-bit prefix $S[0..n-1]$ of S has depth greater than $t(n)$
at significance level c. If we regard a description $\pi$ from which $S[0..n-1]$ can be
derived in at most $t(n)$ computation steps as a $t(n)$-compression of $S[0..n-1]$,
then this says that, for all computable time bounds t and constants c, for all but
finitely many n, every $t(n)$-compression of $S[0..n-1]$ is itself compressible by
more  than  c  bits.  Thus  a  sequence  is  strongly  deep  if  no  computable  time  bound 
suffices  to  compress  infinitely  many  of  its  prefixes  to  within  a  constant  number 
of  bits  of  the  optimal  compression. 

To  put  the  matter  more  fancifully,  no  matter  how  (computably)  much  time  is 
spent  looking  for  inner  structure  (i.e.,  basis  for  compression)  in  a  strongly  deep 
sequence,  an  unbounded  quantity  of  such  inner  structure  remains  undiscovered. 
A  strongly  deep  sequence  is  thus  analogous  to  a  great  work  of  literature  for  which 
no  number  of  readings  suffices  to  exhaust  its  value. 

It  was  shown  by  Bennett  [3]  (and  also  in  [7])  that  no  sequence  that  is  either 
decidable or random (i.e., algorithmically random in the sense of Martin-Löf [10])
can  be  strongly  deep.  However,  strongly  deep  sequences  do  exist.  For  example, 
Bennett  [3]  noted  that  K,  the  diagonal  halting  problem,  is  strongly  deep.  This 
is  because  K,  unlike  a  decidable  or  random  sequence,  can  be  used  (as  an  oracle) 
to  decide  any  decidable  sequence  within  a  computable  (in  fact,  polynomial)  time 
bound  that  does  not  depend  on  the  sequence. 

This relationship between depth and usefulness (as an oracle) was investigated
more explicitly and generally by Juedes, Lathrop, and Lutz [7], who defined
strong and weak usefulness conditions for sequences. A sequence S is strongly
useful if there is a fixed computable time bound $t : \mathbb{N} \to \mathbb{N}$ such that the set
$\mathrm{DTIME}^S(t)$, consisting of all sequences that can be decided in $t(n)$ time using
the oracle S, contains every decidable sequence, i.e., $\mathrm{REC} \subseteq \mathrm{DTIME}^S(t)$, where
REC is the set of all decidable sequences. A sequence S is weakly useful if there
is a fixed computable time bound $t : \mathbb{N} \to \mathbb{N}$ such that the set $\mathrm{DTIME}^S(t)$ does
not have measure 0 in REC, i.e., $\mathrm{DTIME}^S(t) \cap \mathrm{REC}$ is a nonnegligible subset of
REC in the sense of the recursive case of the resource-bounded measure theory
developed by Lutz [9]. That is, S is weakly useful if a nonnegligible set of decidable
sequences can be decided within a computable time bound that may depend
on S but does not depend on the sequence being decided. By the above remark,
K is strongly useful. It is evident that every strongly useful sequence is weakly
useful, and Fenner, Lutz, and Mayordomo [4] have shown that the converse does
not hold, so the set of strongly useful sequences is properly contained in the set
of weakly useful sequences.

Juedes,  Lathrop,  and  Lutz  [7]  proved  that  every  weakly  useful  sequence  is 
strongly  deep.  This  generalized  Bennett’s  observation  that  K  is  strongly  deep 
and  gave  formal  support  to  Bennett’s  informal  arguments  relating  depth  and 
usefulness.  Strong  depth  is  a  necessary  condition  for  weak  usefulness.  Juedes 
[6]  subsequently  asked  whether  the  converse  is  true,  i.e.,  whether  strong  depth 
actually  characterizes  weak  usefulness. 

In this paper, we show that weakly useful sequences have a strictly stronger
depth property than strong depth, thereby answering Juedes’s question negatively.
In fact, this stronger depth property, a constructive refinement of strong
depth called recursively strong depth, is the main topic of this paper.

In the terminology used above to describe strong depth, a sequence S is
recursively strongly deep (briefly, rec-strongly deep) if, for every computable time
bound t and constant c, there exists a computable time bound l such that,
for all but finitely many n, every $t(n)$-compression of $S[0..n-1]$ is itself
$l(n)$-compressible by more than c bits. It is the existence of this computable time
bound l that distinguishes rec-strong depth from strong depth. Returning to the
more fanciful language used earlier, no matter how (computably) much time is
spent looking for inner structure in a rec-strongly deep sequence, and no matter
how much additional structure (any constant number of bits) one wishes to find,
there  is  always  a  greater  (computable)  amount  of  time  that  suffices  to  find  that 
much  more  structure.  A  rec-strongly  deep  sequence  is  thus  analogous  to  a  great 
work  of  literature  with  the  property  that,  no  matter  how  many  times  it  has 
been  read,  there  is  a  greater  number  of  readings  from  which  one  can  derive 
significantly  more  value. 
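
The difference between the two notions is a single extra quantifier. As a sketch only, written in the notation that Section 2 introduces ($\mathrm{depth}_c$ for Bennett's depth, $\mathrm{depth}^{l}_c$ for its latency-bounded version, rec for the computable functions on $\mathbb{N}$, and $\forall^{\infty} n$ for "all but finitely many n"), the two conditions can be set side by side:

```latex
\begin{align*}
S \in \text{strDEEP}
  &\iff (\forall t \in \mathrm{rec})(\forall c \in \mathbb{N})(\forall^{\infty} n)\;
        \mathrm{depth}_{c}(S[0..n-1]) > t(n), \\
S \in \text{rec-strDEEP}
  &\iff (\forall t \in \mathrm{rec})(\forall c \in \mathbb{N})(\exists l \in \mathrm{rec})(\forall^{\infty} n)\;
        \mathrm{depth}^{l(n)}_{c}(S[0..n-1]) > t(n).
\end{align*}
```

In the second line the witness l may depend on both t and c, but it must be computable; that requirement is what the first line does not impose.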

In  this  paper,  we  establish  the  existence  of  sequences  that  are  strongly  deep 
but not rec-strongly deep. Such a sequence S must have the following two properties.

(i) There exist a fixed computable time bound $t_0 : \mathbb{N} \to \mathbb{N}$ and a fixed constant
$c_0 \in \mathbb{N}$ such that, for every computable time bound $l : \mathbb{N} \to \mathbb{N}$, there are
infinitely many prefixes $S[0..n-1]$ of S that have $t_0(n)$-compressions that
are not $l(n)$-compressible by $c_0$ or more bits.

(ii) For every constant $c \in \mathbb{N}$ (no matter how much larger than $c_0$), for all but
finitely many prefixes $S[0..n-1]$ of S, every $t_0(n)$-compression of $S[0..n-1]$
is itself compressible by more than c bits.

By (i), none of the additional compression (beyond $c_0$ bits) promised in (ii) can be
realized within any computable time bound. Once again comparing a sequence
to a work of literature and taking a number of readings as an analogy for a
computable  time  bound,  a  sequence  that  is  strongly  deep  but  not  rec-strongly 
deep  is  analogous  to  a  work  of  literature  for  which  no  number  of  readings  exhausts 
its  value,  but  some  number  of  readings  does  exhaust  all  the  value  that  can  be 
exhausted  by  any  number  of  readings. 

Using  Bennett’s  terminology,  a  rec-strongly  deep  sequence  S  shows  evidence 
of  a  nontrivial  causal  (computational)  history  in  the  constructive,  incremental 
sense  that  every  explanation  of  S  that  can  be  realized  by  an  effective  process  of 
computable  duration  is  significantly  less  plausible  than  some  other  explanation 
of  S  that  can  also  be  realized  by  an  effective  process  of  some  greater  computable 
duration.  In  contrast,  a  sequence  that  is  strongly  deep  but  not  rec-strongly  deep 
has  an  explanation  that  (i)  can  be  realized  by  an  effective  process  of  computable 
duration,  and  (ii)  is  as  plausible  as  any  other  explanation  that  can  be  realized 
by  an  effective  process  of  computable  duration.  Although  such  a  sequence  does 
have  a  more  plausible  explanation,  there  is  no  constructive  evidence  of  this  fact. 

None of the above should be taken to imply that rec-strong depth is a better
(or worse) notion than strong depth. Both notions merit further investigation.
In the case of rec-strong depth, there are several reasons for this. First, as noted
above, rec-strongly deep sequences show evidence of a “nontrivial causal history”
in a natural, constructive, incremental sense. Second, as we show in this
paper, rec-strong depth enjoys the same useful slow-growth property (and consequent
upward closure under truth-table reductions) that Bennett [3] proved
for strong depth. Third, as we show in this paper, rec-strong depth can be used
to separate weak usefulness from strong depth, thereby answering Juedes’s question.
Fourth, as developed below, rec-strong depth is based on a recursive depth
function (with an additional latency parameter), and therefore provides a useful
model for the design and analysis of implementable depth measures such as
the compression depth introduced by Lathrop [8]. Fifth, and perhaps most compelling,
we show that the relationships among rec-strong depth, the notion of
rec-weak depth introduced by Fenner, Lutz, and Mayordomo [4], and the notion
of rec-randomness that has been investigated by Schnorr [11,12], van Lambalgen
[14], Lutz [9], Wang [15], and others correspond closely to the relationships
among strong depth, weak depth, and algorithmic randomness.

This  paper  is  largely  self-contained.  It  can  be  read  independently  of  [3,7], 
but  we  assume  that  [7]  is  at  hand  for  reference.  At  the  end  of  this  section,  we 
introduce a small amount of terminology and notation. Section 2, the main section
of this paper, presents rec-strong depth, rec-weak depth, and our results on
these notions. Section 2 is divided into a preamble and four (sub-)sections. In the
preamble, we develop the above-mentioned recursive depth function, $\mathrm{depth}^l_c(w)$.
In section 2.1 we use this function to introduce rec-strong depth. In section 2.2
we prove the deterministic slow growth law for recursive computational depth
and establish the basic inclusion relations among the weak, strong, rec-weak,
and rec-strong depth classes, namely,

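\[
\text{rec-strDEEP} \subseteq \text{rec-wkDEEP} \subseteq \text{wkDEEP}
\qquad\text{and}\qquad
\text{rec-strDEEP} \subseteq \text{strDEEP} \subseteq \text{wkDEEP}.
\]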

In  section  2.3  we  prove  that  all  these  inclusions  are  proper  by  proving  that  the 
classes rec-wkDEEP and strDEEP are incomparable. Both directions of the incomparability
proof are nontrivial. One direction yields the stronger fact that
rec-random sequences can be strongly deep, while the other direction uses the
recursive version of the first Borel-Cantelli lemma [9] in a Baire category argument.
In section 2.4 we prove that every weakly useful sequence is rec-strongly
deep,  thereby  answering  Juedes’s  question.  Proofs  of  our  results  appear  in  the 
full  version  of  this  paper. 

We  work  in  the  Cantor  space  C,  consisting  of  all  (infinite,  binary)  sequences. 
A string $w \in \{0,1\}^*$ is a prefix of a sequence $S \in \mathbf{C}$, and we write $w \sqsubseteq S$, if there
is a sequence $A \in \mathbf{C}$ such that $S = wA$. For $S \in \mathbf{C}$ and $n \in \mathbb{N}$, we write $S[n]$ for
the $n$th bit of S and $S[0..n-1]$ for the n-bit prefix of S. The complement of a
set $X \subseteq \mathbf{C}$ is the set $X^c = \mathbf{C} - X$.

We  write  REC  for  the  set  of  all  decidable  sequences  in  C  and  rec  for  the  set 
of all computable (total) functions from $\{0,1\}^*$ to $\{0,1\}^*$. Identifying strings
with  their  indices  n  in  the  standard  enumeration  of  {0, 1}*,  we  also  write  rec  for 
the  set  of  all  computable  functions  from  N  to  N. 
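
The notation above can be made concrete with a short Python sketch. It is an illustration only: the names `BitSequence`, `prefix`, `is_prefix`, `index_of`, and `string_at` are hypothetical, a sequence is modeled as a function from positions to bits, and the "standard enumeration" is assumed to be the usual length-lexicographic one (empty string, 0, 1, 00, 01, ...).

```python
from typing import Callable

# Assumption: an infinite binary sequence S is modeled as a function n -> S[n].
BitSequence = Callable[[int], int]


def prefix(S: BitSequence, n: int) -> str:
    """Return S[0..n-1], the n-bit prefix of S, as a string over {0,1}."""
    return "".join(str(S(i)) for i in range(n))


def is_prefix(w: str, S: BitSequence) -> bool:
    """Check that w is a prefix of S, i.e. S = wA for some sequence A."""
    return prefix(S, len(w)) == w


def index_of(w: str) -> int:
    """Index of w in the length-lexicographic enumeration of {0,1}*."""
    return int("1" + w, 2) - 1


def string_at(n: int) -> str:
    """Inverse of index_of: the n-th string in the same enumeration."""
    return bin(n + 1)[3:]  # drop the '0b' marker and the leading 1


def alternating(n: int) -> int:
    """Example sequence 010101..."""
    return n % 2
```

For example, `prefix(alternating, 4)` is `"0101"`, and `index_of` and `string_at` are mutually inverse, which is what allows functions on strings to be read as functions on N.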

2  Recursive  Computational  Depth 

As noted by Bennett [3], the value $\mathrm{depth}_c(w)$, the computational depth of a
string w at significance level c, is not computable from w and c. The following
definition  remedies  this  at  the  expense  of  introducing  an  additional  variable. 


Definition. For $w \in \{0,1\}^*$ and $c, l \in \mathbb{N}$, the recursive computational depth of
w at significance level c with latency l is

\[
\mathrm{depth}^l_c(w) = \min\{\, t \in \mathbb{N} \mid (\exists \pi \in \mathrm{PROG}^t(w))\ |\pi| < K^l(\pi) + c \,\}.
\]


That is, $\mathrm{depth}^l_c(w)$ is the minimum amount of time required to obtain w from
a program $\pi$ that cannot itself be obtained in time l from a program that is c
or more bits shorter than $\pi$. It is clear that $\mathrm{depth}^l_c(w)$ is computable from w,
c, and l; this is why it is called the recursive computational depth. Two other
properties of $\mathrm{depth}^l_c(w)$ are immediately evident. For each $w \in \{0,1\}^*$ and $c \in \mathbb{N}$,
$\mathrm{depth}^l_c(w)$ is nondecreasing in l, and $\lim_{l \to \infty} \mathrm{depth}^l_c(w) = \mathrm{depth}_c(w)$. For each
$w \in \{0,1\}^*$ and $l \in \mathbb{N}$, the value $\mathrm{depth}^l_c(w)$ is, like $\mathrm{depth}_c(w)$, nonincreasing in
c.
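
The definition can be exercised on a toy model. The following Python sketch is not the universal machine of the paper: it assumes a hypothetical finite "machine" given as a table from programs to (output, running time), measures program length in characters rather than bits, and treats a string with no program inside the latency bound as having infinite $K^l$. Under those assumptions it computes the three quantities in the definition directly.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy machine: program -> (output, number of steps).
Machine = Dict[str, Tuple[str, int]]


def prog(U: Machine, w: str, t: int) -> List[str]:
    """PROG^t(w): the programs that output w within t steps."""
    return [p for p, (out, steps) in U.items() if out == w and steps <= t]


def k_bounded(U: Machine, x: str, l: int) -> Optional[int]:
    """K^l(x): length of a shortest program for x running within l steps.

    Returns None (read as +infinity) if the toy machine has no such program.
    """
    lengths = [len(p) for p in prog(U, x, l)]
    return min(lengths) if lengths else None


def rec_depth(U: Machine, w: str, c: int, l: int) -> Optional[int]:
    """depth^l_c(w) = min{ t : some pi in PROG^t(w) has |pi| < K^l(pi) + c }."""
    times = []
    for p, (out, steps) in U.items():
        if out != w:
            continue
        k = k_bounded(U, p, l)
        if k is None or len(p) < k + c:  # p is not l-compressible by c or more
            times.append(steps)
    return min(times) if times else None


W = "0101010101010101"
TOY: Machine = {
    "print-" + W: (W, 1),        # long literal program, very fast
    "rep8-01": (W, 20),          # shorter program, slower
    "w0": (W, 1000),             # shortest program, slowest
    "q": ("print-" + W, 5),      # quickly compresses the literal program
    "z": ("rep8-01", 500),       # slowly compresses the shorter program
}
```

With significance level c = 2, the depth of `W` in this toy machine is 1 at latency 0, 20 at latency 5, and 1000 at latency 500: a larger latency exposes more of the fast programs as compressible, so the value is nondecreasing in l, as noted above.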

2.1  Recursive  Depth  Classes

We begin by defining the recursive analogs of the depth classes $D^t_g(n)$ and $D^t_g$
introduced in [7].


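Definition. For $t, g, l : \mathbb{N} \to \mathbb{N}$ and $n \in \mathbb{N}$, define the sets

\[
D^{t,l}_g(n) = \{\, S \in \mathbf{C} \mid \mathrm{depth}^{l(n)}_{g(n)}(S[0..n-1]) > t(n) \,\}
\]

and

\[
D^{t,l}_g = \bigcup_{m=0}^{\infty} \bigcap_{n=m}^{\infty} D^{t,l}_g(n)
          = \{\, S \in \mathbf{C} \mid (\forall^{\infty} n)\ S \in D^{t,l}_g(n) \,\}.
\]

Note that

\[
D^{t,l}_g(n) = \{\, S \in \mathbf{C} \mid (\forall \pi \in \mathrm{PROG}^{t(n)}(S[0..n-1]))\ K^{l(n)}(\pi) \le |\pi| - g(n) \,\}.
\]

(It is crucial here that the left-hand side of the inequality is not $K^{l(|\pi|)}(\pi)$,
i.e., that the time bound is $l(n)$, not $l(|\pi|)$.)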


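Definition. Let $t, g : \mathbb{N} \to \mathbb{N}$. A sequence $S \in \mathbf{C}$ is recursively t-deep at significance
level g if there is a computable function $l : \mathbb{N} \to \mathbb{N}$ such that $S \in D^{t,l}_g$.

It is clear that, for all $t, g, l : \mathbb{N} \to \mathbb{N}$ with l computable, every sequence in $D^{t,l}_g$
is recursively t-deep at significance level g, and every such sequence belongs to $D^t_g$.
To define recursively strong depth, we substitute recursive t-depth for membership
in $D^t_g$ in the definition of strong depth.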


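Definition. A sequence $S \in \mathbf{C}$ is recursively strongly deep (or, briefly, rec-strongly
deep), and we write $S \in \text{rec-strDEEP}$, if for every computable time
bound $t : \mathbb{N} \to \mathbb{N}$ and every constant $c \in \mathbb{N}$, S is recursively t-deep at significance level c.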

We note that every rec-strongly deep sequence is strongly deep. Since
$\mathrm{REC} \cap \mathrm{strDEEP} = \emptyset$ [3] (see also [7]), it follows immediately that no recursive sequence
can  be  rec-strongly  deep. 

Recall  that  a  sequence  S  is  strongly  deep  if,  for  every  computable  time  bound 
t  and  constant  c,  all  but  finitely  many  prefixes  of  S  can  be  described  at  least 
c  bits  more  succinctly  without  a  time  bound  than  with  the  time  bound  t.  In 
contrast, a sequence S is rec-strongly deep if, for every computable time bound t
and constant c, there exists a computable time bound l such that all but finitely
many prefixes of S can be described at least c bits more succinctly with the time
bound l than with the time bound t. Very informally, a sequence is strongly deep if
it has more regularity than can be explained by a causal (computational) history
of any computable duration. For a sequence to be rec-strongly deep, it must also
be the case that, for every computable duration t, there is a larger computable
duration l such that more of the sequence’s regularity can be explained by a
causal history of duration l than can be explained by a causal history of duration t.
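
The phrase "c bits more succinctly with the time bound l than with the time bound t" can be given a rough practical reading in the spirit of implementable depth measures. The Python sketch below is only an analogy and makes strong assumptions: a fast, small-window compressor (`zlib` at level 1) stands in for descriptions under the smaller bound t, a slower, large-dictionary compressor (`lzma`) stands in for the larger bound l, and the function names are hypothetical. Neither compressor computes Kolmogorov complexity, and this is not claimed to be the compression depth of [8].

```python
import lzma
import random
import zlib


def size_fast_bits(data: bytes) -> int:
    """Description length under the cheap compressor (stand-in for bound t)."""
    return 8 * len(zlib.compress(data, 1))


def size_slow_bits(data: bytes) -> int:
    """Description length under the costlier compressor (stand-in for bound l)."""
    return 8 * len(lzma.compress(data))


def gain_bits(data: bytes) -> int:
    """How many bits the costlier compressor saves over the cheap one."""
    return size_fast_bits(data) - size_slow_bits(data)


rng = random.Random(0)  # fixed seed so the example is reproducible

# A 40,000-byte noisy block repeated three times: the repetition lies beyond
# zlib's 32 KiB window but well inside lzma's default dictionary.
block = bytes(rng.getrandbits(8) for _ in range(40_000))
structured = block * 3

# Plain noise of the same total length: neither compressor finds structure.
noise = bytes(rng.getrandbits(8) for _ in range(120_000))
```

On `structured` the gain is large, since the extra resources uncover regularity that the cheap compressor misses; on `noise` the gain is negligible. Only the first kind of data behaves like a prefix for which more computation yields a significantly shorter description.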

Our  next  result  states  that  rec-strongly  deep  sequences  cannot  be  rec-random. 

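Theorem 1. $\mathrm{RAND}(\mathrm{rec}) \cap \text{rec-strDEEP} = \emptyset$. In fact, there exist a computable
function $t(n) = O(n \log n)$ and a constant $c \in \mathbb{N}$ such that no sequence in
RAND(rec) is recursively t-deep at significance level c.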

Recursively  weak  depth  was  introduced  by  Fenner,  Lutz,  and  Mayordomo 
[4]. We write rec-wkDEEP for the class of all rec-weakly deep sequences.


2.2  Class  Inclusions 

In  this  section,  we  establish  the  basic  inclusion  relations  that  hold  among  the 
weak  and  strong  depth  classes  defined  in  [7]  and  section  2.1.  For  this  and  later 
purposes,  we  need  a  technical  lemma.  This  result,  called  the  deterministic  slow- 
growth  law  for  recursive  computational  depth,  places  a  quantitative  upper  bound 
on  the  ability  of  a  time-bounded  oracle  Turing  machine  to  amplify  the  depth  of 
its  oracle.  Details  appear  in  the  full  version  of  this  paper. 

An  easy  consequence  of  the  Slow  Growth  Lemma  is  the  fact  that  the  class 
of  rec-strongly  deep  sequences  is  (like  the  class  of  strongly  deep  sequences  [7]) 
closed  upwards  under  tt-reductions. 

Theorem 2. Let $A, B \in \mathbf{C}$. If $B \le_{tt} A$ and B is rec-strongly deep, then A is
rec-strongly  deep. 

We  now  come  to  the  main  result  of  section  2.2.  The  following  theorem  gives 
the  inclusion  relations  that  hold  among  the  weak,  strong,  rec-weak,  and  rec- 
strong  depth  classes. 

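Theorem 3. The following inclusions hold.

\[
\text{rec-strDEEP} \subseteq \text{rec-wkDEEP} \subseteq \text{wkDEEP}
\qquad\text{and}\qquad
\text{rec-strDEEP} \subseteq \text{strDEEP} \subseteq \text{wkDEEP}.
\]
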


2.3  Class  Separations 

We  now  show  that  all  four  inclusions  in  Theorem  3  are  proper.  It  is  most  efficient 
(and  most  informative)  to  prove  this  by  proving  the  two  non-inclusions 

\[ \text{strDEEP} \not\subseteq \text{rec-wkDEEP} \]

and

\[ \text{rec-wkDEEP} \not\subseteq \text{strDEEP}. \]

We  prove  these  in  succession. 

We prove that $\text{strDEEP} \not\subseteq \text{rec-wkDEEP}$ by proving the much stronger fact
that  strongly  deep  sequences  can  be  recursively  random.  We  do  this  by  examining 
the  Kolmogorov  and  the  time-bounded  Kolmogorov  complexities  of  recursively 
random  sequences. 

We first prove that rec-random sequences have very high time-bounded Kolmogorov
complexities.

Theorem 4. Assume that S is rec-random and that $t, g : \mathbb{N} \to \mathbb{N}$ are computable
functions with g nondecreasing and unbounded. Then, for all but finitely many
$n \in \mathbb{N}$,

\[ K^t(S[0..n-1]) > n - g(n). \]


The function g above may be very slowly growing, e.g., an inverse Ackermann
function. Theorem 4 thus says that, for every rec-random sequence S
and computable time bound t, all but finitely many of the prefixes of S have
$K^t$-complexities that are nearly as large as their lengths.

We  next  show  that  the  situation  is  very  different  in  the  absence  of  the  time 
bound  t. 

Definition. A sequence $S \in \mathbf{C}$ is ultracompressible if, for every computable,
nondecreasing, unbounded function $g : \mathbb{N} \to \mathbb{N}$, there exists $n_g \in \mathbb{N}$ such that,
for all $n > n_g$,

\[ K(S[0..n-1]) < K(n) + g(n). \tag{1} \]


It is clear that every n-bit string w must satisfy $K(w) \ge K(n) - O(1)$. A
sequence S is thus ultracompressible if, for every computable, nondecreasing,
unbounded (but perhaps very slowly growing) function g, for all but finitely
many n, the n-bit prefix of S has K-complexity that is within $g(n)$ bits of the
minimum possible K-complexity for an n-bit string.

We now show that a rec-random sequence can be ultracompressible. Similar
results have been proven by Wang [15] and Ambos-Spies and Wang [1] for
the monotone Kolmogorov complexities of rec-random sequences. The present
result is slightly stronger than these results in that it gives a single rec-random sequence
S that has property (1) for every computable, nondecreasing, unbounded
function  g.  The  proof  is  based  in  part  on  a  simpler,  unpublished  construction 
by  Gasarch  and  Lutz  [5]  of  a  rec-random  sequence  that  is  not  algorithmically 
random. 

Theorem 5. There is a rec-random sequence that is ultracompressible.

We now note that rec-random sequences can be strongly deep.

Theorem  6.  There  is  a  rec-random  sequence  that  is  strongly  deep. 

Theorem 6 contrasts sharply with Theorem 1 and the fact that $\mathrm{RAND} \cap \mathrm{strDEEP} = \emptyset$.
There is of course nothing paradoxical in this contrast. It is
merely  a  consequence  of  the  strong,  quantitative  separation  of  RAND(rec)  from 
RAND  given  by  Theorem  5. 
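
The size of this separation can be made explicit by combining Theorem 4 with property (1). The following is a sketch rather than the proof from the full version: it assumes a sequence S that is both rec-random and ultracompressible (Theorem 5), the standard bound $K(n) \le 2\log n + O(1)$, and the particular choice $g(n) = \lfloor \log(n+1) \rfloor$, which is computable, nondecreasing, and unbounded.

```latex
\begin{align*}
K^{t}(S[0..n-1]) - K(S[0..n-1])
  &> \bigl(n - g(n)\bigr) - \bigl(K(n) + g(n)\bigr)
     && \text{(Theorem 4 and (1), for all but finitely many } n\text{)} \\
  &= n - K(n) - 2g(n) \\
  &\ge n - 4\log(n+1) - O(1)
     && \text{(assumed bound on } K(n) \text{ and choice of } g\text{)} .
\end{align*}
```

The right-hand side is unbounded, so for every computable time bound t and constant c, all but finitely many prefixes of S can be described at least c bits more succinctly without a time bound than with the time bound t. This is the description of strong depth recalled in section 2.1.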

We  now  have  the  first  of  the  desired  noninclusions. 

Corollary 7. $\text{strDEEP} \not\subseteq \text{rec-wkDEEP}$.

The  following  known  theorem  says  that  the  set  of  strongly  deep  sequences  is 
small  in  the  sense  of  Baire  category. 

Theorem 8 (Juedes, Lathrop, and Lutz [7]). The class strDEEP is meager.

We show that $\text{rec-wkDEEP} \not\subseteq \text{strDEEP}$ by showing that rec-wkDEEP is
comeager.  Our  proof  of  this  fact  is  somewhat  more  involved  than  the  proof  by 
Juedes,  Lathrop,  and  Lutz  [7]  that  wkDEEP  is  comeager. 

Theorem 9. For each uniform reducibility F, the class rec-F-DEEP is rec-comeager,
hence comeager in REC.

Theorem  10.  The  class  rec-wkDEEP  is  comeager. 

Corollary 11. $\text{rec-wkDEEP} \not\subseteq \text{strDEEP}$.

We now have the main result of section 2.3.

Theorem 12. All four inclusions in Theorem 3 are proper.


By Theorem 12, there exist sequences that are strongly deep, but not rec-strongly
deep. Let S be such a sequence. Since S is not rec-strongly deep, there
exist a fixed computable time bound $t_0 : \mathbb{N} \to \mathbb{N}$ and a fixed constant $c_0 \in \mathbb{N}$
such that, for every computable time bound $l : \mathbb{N} \to \mathbb{N}$, there are infinitely many
prefixes of S that cannot be described $c_0$ bits more succinctly with the time
bound l than with the time bound $t_0$. Nevertheless, since S is strongly deep, it
must be the case that, for every constant $c \in \mathbb{N}$ (even when c is much greater
than $c_0$), all but finitely many prefixes of S can be described at least c bits more
succinctly without a time bound than with the time bound $t_0$. None of this
additional succinctness (beyond $c_0$ bits) can be realized within any computable
time bound; all of it requires greater-than-computable running time. The depth
of such a sequence S appears to come not so much from a nontrivial causal
(computational) history as from something utterly noncomputational.

If  F  is  a  uniform  reducibility  that  is  (like  all  standard  reducibilities)  reflexive, 
then  the  measure  and  category  of  the  class  rec-F-DEEP  are  of  some  interest. 
First,  rec-F-DEEP  must  be  disjoint  from  RAND(rec),  so  rec-F-DEEP  must  be 
a  measure  0  subset  of  C.  Also,  by  Theorem  9,  rec-F-DEEP  must  be  comeager. 
Thus,  the  class  rec-F-DEEP  is  small  in  the  sense  of  measure,  but  large  in  the 
sense  of  Baire  category.  This  state  of  affairs  is  not  unusual  and  would  not  be 
worth  mention,  were  it  not  for  the  fact  that  the  situation  changes  when  we  look 
at  the  measure  and  category  of  rec-F-DEEP  in  REC.  By  [4]  and  Theorem  9, 
rec-F-DEEP  is  large  in  REC  in  the  senses  of  both  measure  and  category.  The 
class  rec-F-DEEP  is  thus  one  concerning  which  measure  and  category  agree  in 
REC,  but  disagree  in  C. 

2.4  Weakly  Useful  Sequences 

Juedes,  Lathrop,  and  Lutz  [7]  defined  the  class  of  weakly  useful  sequences  and 
proved  that  every  weakly  useful  sequence  is  strongly  deep.  Fenner,  Lutz,  and 
Mayordomo  [4]  subsequently  proved  that  every  weakly  useful  sequence  is  rec- 
weakly  deep.  In  this  section,  we  strengthen  both  these  results  by  proving  that 
every  weakly  useful  sequence  is  rec-strongly  deep.  Our  argument  closely  follows 
that  of  [7]. 


Definition (Juedes, Lathrop, and Lutz [7]). A sequence $A \in \mathbf{C}$ is strongly useful,
and we write $A \in \text{strUSEFUL}$, if there is a computable time bound $s : \mathbb{N} \to \mathbb{N}$
such that $\mathrm{REC} \subseteq \mathrm{DTIME}^A(s)$. A sequence $A \in \mathbf{C}$ is weakly useful, and we write
$A \in \text{wkUSEFUL}$, if there is a computable time bound $s : \mathbb{N} \to \mathbb{N}$ such that
$\mathrm{DTIME}^A(s)$ does not have measure 0 in REC.

Thus  a  sequence  is  strongly  useful  if  it  enables  one  to  solve  all  decidable 
sequences  in  some  fixed,  computable  amount  of  time.  A  sequence  is  weakly  useful 
if  it  enables  one  to  solve  all  elements  of  a  nonnegligible  set  of  decidable  sequences 
in  some  fixed,  computable  amount  of  time. 

Recall that the diagonal halting problem is the sequence K whose $n$th bit is

\[ K[n] = [\, M_n(n) \text{ halts} \,], \]

where $M_0, M_1, \ldots$ is a standard enumeration of all deterministic Turing machines.
It is well-known that K is polynomial-time many-one complete for the
set of all recursively enumerable subsets of N, so K is strongly useful.
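
Because K is recursively enumerable, its bits can be approximated from below by running each $M_n$ on input n for a bounded number of steps. The Python sketch below illustrates only this approximation scheme. The enumeration `machine` is a hypothetical toy family (its halting behavior is trivially decidable, unlike that of real Turing machines), each `yield` counts as one computation step, and the names are assumptions made for the example.

```python
from typing import Callable, Iterator, List

_DONE = object()  # sentinel returned by next() once a run has halted


def machine(n: int) -> Callable[[int], Iterator[None]]:
    """Toy stand-in for the n-th machine M_n; each yield is one step."""
    def run(x: int) -> Iterator[None]:
        if n % 3 == 0:
            while True:           # these toy machines never halt
                yield
        for _ in range(n + x):    # the others halt after n + x steps
            yield
    return run


def halts_within(n: int, x: int, s: int) -> bool:
    """True if M_n on input x halts after at most s steps."""
    steps = machine(n)(x)
    for _ in range(s + 1):
        if next(steps, _DONE) is _DONE:
            return True
    return False


def approx_K(s: int, length: int) -> List[int]:
    """Stage-s approximation to K[0..length-1]: bit n is 1 iff M_n(n) halts within s steps."""
    return [int(halts_within(n, n, s)) for n in range(length)]
```

Each bit of `approx_K(s, length)` can only change from 0 to 1 as s grows, and for real machines no computable function of n bounds the stage at which bit n settles. Here `approx_K(4, 6)` is `[0, 1, 1, 0, 0, 0]` and `approx_K(10, 6)` is `[0, 1, 1, 0, 1, 1]`.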

It  is  clear  that  every  strongly  useful  sequence  is  weakly  useful.  Fenner,  Lutz, 
and  Mayordomo  [4]  used  martingale  diagonalization  to  construct  a  sequence  that 
is weakly useful but not strongly useful, so $\text{strUSEFUL} \subsetneq \text{wkUSEFUL}$.


We  now  establish  the  rec-strong  depth  of  weakly  useful  sequences. 

Theorem 13. Every weakly useful sequence is rec-strongly deep.

Juedes  [6]  asked  whether  every  strongly  deep  sequence  is  weakly  useful.  We 
now  answer  this  question  negatively. 

Corollary 14. $\text{wkUSEFUL} \subsetneq \text{strDEEP}$.

Abstract. We study the computational power of Piecewise Constant
Derivative  (PCD)  systems.  PCD  systems  are  dynamical  systems  defined 
by  a  piecewise  constant  differential  equation  and  can  be  considered  as 
computational  machines  working  on  a  continuous  space  with  a  continu¬ 
ous  time.  We  show  that  the  computation  time  of  these  machines  can  be 
measured  either  as  a  discrete  value,  called  discrete  time,  or  as  a  continu¬ 
ous  value,  called  continuous  time.  We  prove  that  the  languages  recognized 
by  PCD  systems  in  dimension  d  in  finite  continuous  time  are  precisely 
the languages of the $(d-2)$th level of the arithmetical hierarchy. Hence we
provide  a  precise  characterization  of  the  computational  power  of  purely 
rational  PCD  systems  in  continuous  time  according  to  their  dimension 
and  we  solve  a  problem  left  open  by  [2]. 


1  Introduction 

Recently there has been increasing interest in hybrid systems in the control and
verification theory community. A hybrid system is a system that combines
discrete and continuous dynamics. Hybrid systems can also be considered
as computational machines: they can be seen either as machines working on
a  continuous  space  with  a  discrete  time  or  as  machines  working  on  a  continuous 
space  with  a  continuous  time. 

The first point of view has been investigated in [1, 2, 4, 5]. In particular, in
[1, 2, 3] the attention is focused on a very simple type of hybrid system: Piecewise
Constant Derivative systems (PCD systems) are dynamical systems defined
by a piecewise constant differential equation. It is shown that the reachability
problem for PCD systems is decidable in dimension d = 2 and undecidable in
dimension $d \ge 3$ [1, 3]. In [4], the computational power of Piecewise Constant
Derivative  systems  is  characterized  as  P/poly  in  polynomial  discrete  time,  and 
as  unbounded  in  exponential  discrete  time. 

This paper deals with the second point of view that considers hybrid systems
as machines that work on a continuous space with a continuous time. The study
of computational machines that work in a continuous time is only beginning: in
[6], Moore proposed a recursion theory for computations on the reals in continuous
time. Recently, Asarin and Maler [2] showed, using Zeno’s paradox, that
every set of the arithmetical hierarchy can be recognized in finite continuous
time and in finite dimension by a PCD system: every set of the arithmetical
hierarchy in $\Sigma_k \cup \Pi_k$ can be recognized by a rational PCD system in dimension
$5k + 1$. Unfortunately, no precise characterization of the PCD recognizable sets
was given in [2]. In this paper, we improve the results of Asarin and Maler and
we provide a full characterization of the sets recognized by purely rational PCD
systems: we show that the sets that are recognized by purely rational PCD systems
in dimension d are precisely the sets of the $(d-2)$th level of the arithmetical
hierarchy.

Section 2 is devoted to some general definitions: PCD systems, computations
on PCD systems, discrete and continuous time. In section 3, we improve the result
of Asarin and Maler by a factor of five in the dimension: any arithmetical set in
$\Sigma_k$ can be recognized in dimension $2 + k$. In section 4 we prove that this bound
is optimal for purely rational PCD systems: no other set can be recognized in that dimension.

2  Definitions 

A convex polyhedron of $\mathbb{R}^d$ is any finite intersection of open or closed half spaces
of $\mathbb{R}^d$. A polyhedron of $\mathbb{R}^d$ is a finite union of convex polyhedra of $\mathbb{R}^d$. In
particular, a polyhedron may be unbounded or flat. For $V \subseteq \mathbb{R}^d$ we denote by
$\overline{V}$ the topological closure of V. We denote by d the Euclidean distance of $\mathbb{R}^d$. A
rational point of $\mathbb{R}^d$ is a point of $\mathbb{R}^d$ with rational coordinates.

Definition 1 PCD System [1, 2]. A Piecewise Constant Derivative (PCD)
system of dimension $d$ is a couple $\mathcal{H} = (X, f)$ with $X = \mathbb{R}^d$, $f : X \to X$, where
the range of $f$ is a finite set $C \subset X$, such that for any $c \in C$ ($c$ is called a slope)
$f^{-1}(c)$ is a finite union of convex polyhedral sets (called regions). A trajectory
of $\mathcal{H}$ starting from $x_0$ is a continuous solution to the differential equation
$\dot{x}_d = f(x)$, with initial condition $x_0$, where $\dot{x}_d$ denotes the right derivative: that is,
$\Phi : D \subset \mathbb{R}^+ \to X$ where $D$ is an interval of $\mathbb{R}^+$ containing 0, $\Phi(0) = x_0$, and
$\forall t \in D$, $\dot{\Phi}_d(t) = f(\Phi(t))$. Trajectory $\Phi$ is said to continue for ever if $D = \mathbb{R}^+$.

In other words, a PCD system consists of partitioning the space into convex
polyhedral regions, and assigning a constant derivative $c$, called slope, to all the
points sharing the same region. The trajectories of such systems are broken lines
with the breakpoints occurring on the boundaries of the regions [2]. See figure 1.
The signature of a trajectory is the sequence of the regions that are crossed by
the trajectory.

Definition 2 Rational, purely rational PCD systems. - A PCD system
is called rational if all the slopes as well as all the polyhedral regions can
be described using only rational coefficients.

- A PCD system is called purely rational if, in addition, for every trajectory
$\Phi$ starting from a rational point, each time $\Phi$ enters a region in a point $x$,
necessarily $x$ has rational coordinates.

Fig. 1. A PCD system in dimension 2.


Some comments are in order: one must understand that a trajectory $\Phi$ can
enter  a  region  either  by  a  discrete  transition  or  by  converging  to  a  point  of  the 
region:  see  figure  2.  Thus,  in  other  words,  in  a  purely  rational  PCD  system  any 
converging  process  converges  towards  a  point  with  rational  coordinates.  Note 
that  one  can  construct  a  rational  PCD  system  of  dimension  5  that  is  not  purely 
rational. 

We can say some words on the existence of trajectories in a PCD system: let
$x_0 \in X$. We say that $x_0$ is trajectory well-defined if there exists an $\epsilon > 0$ such
that $f(x) = f(x_0)$ for all $x \in [x_0, x_0 + \epsilon \cdot f(x_0)]$. It is clear that, for any $x_0 \in X$,
there exists a trajectory starting from $x_0$ iff $x_0$ is trajectory well-defined. Given
a rational PCD system $\mathcal{H}$, one can effectively compute the set $NoEvolution(\mathcal{H})$
of the points of $X$ that are not trajectory well-defined. Note that a trajectory can
continue for ever iff it does not reach $NoEvolution(\mathcal{H})$.

Definition 3 Computation [2]. - Let $\mathcal{H} = (X, f)$ be a PCD system of
dimension $d$. Let $I = [0, 1]$ and let $r : \mathbb{N} \to I$ be an injective coding
function, let $x^1$, $x^0$ be two distinct points of $\mathbb{R}^d$. A computation of system
$H = (\mathbb{R}^d, f, r, x^1, x^0)$ on entry $n \in \mathbb{N}$ is a trajectory that can continue forever
(defined on all $\mathbb{R}^+$) of $\mathcal{H} = (X, f)$ starting from $(r(n), 0, \ldots, 0)$. The
computation is accepting if the trajectory eventually reaches $x^1$, and refusing if it
reaches $x^0$. It is assumed that the derivatives at $x^1$ and $x^0$ are zero.

- Language $L \subset \mathbb{N}$ is semi-recognized by $H$ if, for every $n \in \mathbb{N}$, there is a
computation on entry $n$ and the computation is accepting iff $n \in L$. $L$ is
said to be (fully-)recognized by $H$ when, in addition, this trajectory reaches
$x^0$ iff $n \notin L$.

Definition 4 Continuous and Discrete time. Let $\Phi : \mathbb{R}^+ \to X$ be an
accepting computation on entry $n \in \mathbb{N}$.

- The continuous time $T_c(n)$ of the computation is $T_c(n) = \min\{t \in \mathbb{R}^+ \mid \Phi(t) = x^1\}$.

- Let $\mathcal{T}_n = \{t \mid \Phi$ crosses a boundary of a region at time $t\}$. It is easy to see
that $\mathcal{T}_n$ is a well ordered set. The discrete time $T_d(n)$ of the computation is
defined as the order type of the well ordered set $\mathcal{T}_n$ (= the ordinal corresponding
to $\mathcal{T}_n$).

Note  that  Zeno’s  paradox  appears:  to  a  continuous  finite  time  can  correspond 
a  transfinite  discrete  time:  see  figure  2. 


[Figure: a planar PCD system with slope labels (-1,1), (-1,1/2), (1,-1), (1,1); the abscissae x/2 and x are marked.]

Fig. 2. Zeno's paradox: at finite continuous time $5x = 2.5(x + x/2 + x/4 + \ldots)$ the
trajectory is in (0,0), but it takes a transfinite discrete time $\omega$ to reach this point.
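
The Zeno behaviour described in this caption can be checked with exact rational arithmetic. The sketch below is only an illustration under assumptions: the regions are the four quadrants and the slopes in `SLOPES` are hypothetical, chosen so that one turn starting at abscissa x takes time 2.5x and ends at abscissa x/2 as in the caption (they are not read off the figure). The names `region`, `SLOPES` and `run` are introduced here.

```python
from fractions import Fraction as F

# Hypothetical slopes, one per quadrant (an assumption, not the figure's values).
SLOPES = {
    "Q1": (F(-1), F(1)),   # x > 0,  y >= 0
    "Q2": (F(-1), F(-2)),  # x <= 0, y > 0
    "Q3": (F(1), F(-1)),   # x < 0,  y <= 0
    "Q4": (F(1), F(1)),    # x >= 0, y < 0
}


def region(p):
    """Return the name of the region containing p, or None at the origin."""
    x, y = p
    if x > 0 and y >= 0:
        return "Q1"
    if x <= 0 and y > 0:
        return "Q2"
    if x < 0 and y <= 0:
        return "Q3"
    if x >= 0 and y < 0:
        return "Q4"
    return None


def run(p, crossings):
    """Follow the trajectory from p through the given number of boundary crossings.

    Returns the final point and the elapsed continuous time.
    """
    t = F(0)
    for _ in range(crossings):
        r = region(p)
        if r is None:  # the origin: nothing more to simulate
            break
        dx, dy = SLOPES[r]
        x, y = p
        # Times at which the straight line from p hits the axes x = 0 or y = 0.
        candidates = []
        if x != 0 and dx != 0 and -x / dx > 0:
            candidates.append(-x / dx)
        if y != 0 and dy != 0 and -y / dy > 0:
            candidates.append(-y / dy)
        dt = min(candidates)
        p = (x + dt * dx, y + dt * dy)
        t += dt
    return p, t
```

Starting from (x, 0), every finite number of crossings leaves the trajectory strictly before time 5x, so in this sketch the point (0,0) is only reached after infinitely many crossings, which is the discrete time $\omega$ of the caption.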


We  recall  the  following  definition: 

Definition 5 Arithmetical hierarchy [8, 7]. The classes $\Sigma_k$, $\Pi_k$, $\Delta_k$, for $k \in \mathbb{N}$,
are defined inductively by:

- $\Sigma_0$ is the class of the languages that are recursive.

- For $k \ge 1$, $\Sigma_k$ is the class of the languages that are recursively enumerable
in a set in $\Sigma_{k-1}$ (that is, semi-recognized by a Turing machine with an oracle
in $\Sigma_{k-1}$).

- For $k \in \mathbb{N}$, $\Pi_k$ is defined as the class of languages whose complements are in
$\Sigma_k$, and $\Delta_k$ is defined as $\Delta_k = \Pi_k \cap \Sigma_k$.

Several characterizations of the sets of the arithmetical hierarchy are known:
see [7, 8]. In particular we will assume the reader is familiar with Tarski-Kuratowski
computations: assume a first order formula $F$, over some recursive predicates,
characterizing the elements of a set $S \subset \mathbb{N}$, is given. Then $S$ is in the arithmetical
hierarchy and the Tarski-Kuratowski algorithm on formula $F$ returns a level of
the arithmetical hierarchy containing $S$: see [7, 8] for the full details.
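
For a formula that is already in prenex form, the level returned by such a computation is obtained by counting blocks of like quantifiers. The following sketch makes several assumptions: the quantifier prefix is given as a string over the hypothetical letters 'E' (for $\exists$) and 'A' (for $\forall$), outermost first, the quantifiers range over unbounded natural-number variables, and the matrix is recursive. The name `tk_level` is introduced here, and the result is a level containing the set, not necessarily the lowest one.

```python
def tk_level(prefix):
    """Return (class_name, k) for a prenex quantifier prefix.

    prefix is a string over 'E' (exists) and 'A' (forall), outermost first.
    Adjacent quantifiers of the same kind are contracted into one block;
    the number of blocks is the level, and the outermost block decides
    between Sigma (starts with 'E') and Pi (starts with 'A').
    """
    blocks = []
    for q in prefix:
        if q not in "EA":
            raise ValueError("prefix must contain only 'E' and 'A'")
        if not blocks or blocks[-1] != q:
            blocks.append(q)
    if not blocks:
        # No unbounded quantifier: a recursive set (Sigma_0 = Pi_0).
        return ("Sigma", 0)
    return ("Sigma" if blocks[0] == "E" else "Pi", len(blocks))
```

For instance, a formula of the shape "there exists a time such that for all inputs there exists ..." has prefix "EAE" and is placed in $\Sigma_3$ by this count.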

3  PCD  Systems  can  Recognize  Arithmetical  Sets 

It was shown in [2] that every set of the arithmetical hierarchy can be recognized
in finite continuous time: more precisely, it is shown that $L \in \Sigma_k \cup \Pi_k$ can be
recognized by a PCD system of dimension $5k + 1$. Therefore, five dimensions
are used in [2] to climb each level of the arithmetical hierarchy: one for a timer,
one used for the divisions by 2, one used to do the homogenization, and two
dimensions used to go from quantifier elimination to semi-recognition. We show
here that only one dimension is needed (the one used to do the homogenization),
and that the construction only requires purely rational PCD systems.

Theorem 6. - Any language $L$ of $\Sigma_k$ is semi-recognized by a purely rational
PCD system in dimension $2 + k$.

- Any language $L$ of $\Delta_k$ is fully-recognized by a purely rational PCD system
in dimension $2 + k$.

The  proof  is  rather  technical:  timers  are  suppressed  by  using  machines  that 
cross  a  given  hyper-plane  at  regular  time,  divisions  by  two  are  done  by  reusing 
the  variables  defining  the  machines,  and  the  two  variables  used  in  [2]  to  go 
from  quantifier  elimination  to  semi-recognition  are  suppressed  by  storing  some 
information  in  the  variable  used  to  do  the  homogenization. 


4  PCD  Systems  Cannot  Recognize  Any  Other  Set 

4.1  Local  dimension 

We  define: 


Fig. 3. From left to right: $x^*$ is of local dimension $1^+$, $2^+$, 3 in a PCD system of
dimension 3.


Definition 7 Local dimension. Let $\mathcal{H} = (X, f)$ be a PCD system in
dimension $d$. Let $x^*$ be a point of $X$. Let $\Delta$ be a polyhedral subset $\Delta \subset X$ of maximal
dimension $d - d'$ ($1 \le d' \le d$) such that there exists an open convex polyhedron
$V \subset X$, with $x^* \in \Delta \cap V$, and such that, for any region $F$ of $\mathcal{H}$, $F \cap V \neq \emptyset$
implies $\Delta \subset \overline{F}$ ($\overline{F}$ is the topological closure of $F$).

If $d' < d$ then $x^*$ is said to be of local dimension $d'^+$. If $d' = d$ then $x^*$ is
said to be of local dimension $d'$ and we can always choose $V$ small enough such
that $x^*$ is the only point of local dimension $d'$ in $V$: see figure 3.

Note that given a rational PCD system $\mathcal{H} = (X, f)$ and $k = d'$ or $k = d'^+$,
one can effectively compute $LocDim(\mathcal{H}, k)$ defined as the set of the points $x \in X$
that have local dimension equal to $k$.

The main idea behind definition 7 is given by the following proposition: see figure 4.

Fig. 4. Proposition 8: if $x^*$ is of local dimension $2^+$ in a PCD system $\mathcal{H}$ of dimension
3, the projections on $P$ of the trajectories of $\mathcal{H}$ in neighborhood $V$ of $x^*$ are precisely
the trajectories of some PCD system $\mathcal{H}'$ of dimension 2.


Proposition 8. Let $\mathcal{H} = (X, f)$ be a PCD system in dimension $d$. Let $x^*$ be a
point of local dimension $(d')^+$ with $d' < d$. Call $P$ the affine variety of dimension
$d'$ which is the orthogonal of $\Delta$ in $x^*$. It is possible to construct a PCD system
$\mathcal{H}' = (X' = \mathbb{R}^{d'}, f')$ in dimension $d'$ such that the trajectories of $\mathcal{H}'$ are the
orthogonal projections on $P$ of the trajectories of $\mathcal{H}$ in $V$.

For any point $x^*$, the corresponding $V$ is denoted by $V_{x^*}$. $\mathcal{H}'$, $\Delta$ are
respectively denoted by $\mathcal{H}_{x^*}$ and $\Delta_{x^*}$. If $d' < d$ we denote by $p_{x^*}$ and $q_{x^*}$ the functions
that map every point $x \in X$ onto its orthogonal projection on $P$ and onto its
orthogonal projection on $\Delta$ respectively. If $d' = d$, we define $p_{x^*}$ and $q_{x^*}$ as
respectively the identity function and the null function. We assume the natural
order $1 < 1^+ < 2 < 2^+ < \ldots$.

Lemma 9. Let $\mathcal{H} = (X, f)$ be a PCD system of dimension $d$. Let $\Phi$ be a
trajectory of $\mathcal{H}$ that reaches $x^*$ at finite continuous time $T_c$. Assume that $x^*$ is
of local dimension $k = d'$ or $k = (d')^+$. For any $l$, denote by $S_l$ the set of the
points $x \in X$ that are reached by $\Phi$ at some time $0 < t < T_c$ and that have local
dimension $l$. Assume $S_l = \emptyset$, for all $l > k$.

- $S_k$ is a finite set.

- Assume $S_k = \emptyset$. Fix the origin in $x^*$. Then either $S_{(d'-1)^+}$ is a finite set
or there exist $y_1, y_2 \in X$ that are reached by $\Phi$, there exists $0 < \lambda < 1$
such that $p_{x^*}(y_2) = \lambda p_{x^*}(y_1)$ and such that, for all $n \ge 1$, $\Phi$ reaches at a
time $t_n < T_c$ the point $y_n$ defined by $p_{x^*}(y_n) = \lambda^{n-1} p_{x^*}(y_1)$,
$q_{x^*}(y_n) = q_{x^*}(y_1) + \sum_{i=0}^{n-2} \lambda^i (q_{x^*}(y_2) - q_{x^*}(y_1))$.

Proof.  Let  m  <  k.  We  prove  first  that  if  Sm  is  not  a  finite  set,  then  #  reaches 
a  point  of  local  dimension  >  m  at  some  time  <  Tc'.  assume  that  Sm  is  not  a 

finite  set.  Tm  =  G  5m}  is  a  well  ordered  set.  Denote  its  elements  by 

tIAt'A , . . .  A'fj , . . Take  =  supi^^tf .  We  have  <  Tc.  Consider  x'f^  = 
By  continuity  of  there  exists  such  that  t  G  ^(0  ^ 


V^m  .  Take  i  6  H  5^.  From  considerations  of  dimensions  about  point 

0(t)  of  local  dimension  m  in  14 m  ,  we  get  that  the  local  dimension  d'^  of  is 
>  m.  From  the  definition  of  we  get  d"  /  m.  Hence  d"  >  m  and  our  claim  is 
proved:  if  5m  is  not  a  finite  set  then  reaches  some  of  local  dimension  >  m. 

The  first  assertion  of  the  lemma  is  an  easy  consequence  of  this  claim  with 
m  =  k. 

For  the  second  assertion,  take  m  —  [d'  -  1)  +  ,  and  assume  that  is 

not  a  finite  set.  From  Sk  =  0,  we  must  have  =  x*  and  =  Tc.  If  k  <  d 
denote  'H'  =  Ur.*  else  take  W  =  %.  Define  as  From  time  up  to 

time  Tc,  is  a  trajectory  of  =  (^^/')  (apply  proposition  8  for  k  <  d), 
reaching  at  time  Tc.  Let  C  be  the  set  of  the  one-dimensional  regions  of 

W  that  intersect  Fj*  =  Px*  (K*)-  We  claim  that  each  time  reaches  a  point  of 
reaches  an  element  of  C\  if  reaches  some  point  x*^  G  X'  of  local 
dimension  {d  -  1)+  at  some  time  t  G  [t”",  Tc],  then  Px*  is  an  element  of  C 

and  contains  x*' .  See  figure  5. 


Fig. 5. Proof of lemma 9: here $d = d' = 3$. $\mathcal{L}$ is defined as the set of the one-dimensional
regions that intersect $p_{x^*}(V_{x^*})$. $\mathcal{L}$ is made of a finite number of segments. Each time
the trajectory reaches a point of local dimension $2^+$, it reaches $\mathcal{L}$. If the trajectory
reaches $\mathcal{L}$ twice in the same segment then the trajectory is ultimately cycling.


Since  converges  to  px*  since  £  is  a  finite  set,  since  is  infinite, 

reaches  two  times  the  same  element  of  C  in  Px*(yi)  and  Px*(y2)  with 
Px*{y2)  ~  h^x*{yi)  for  some  0  <  A  <  1,  at  some  times  ty,,ty^  with  <  ty^  < 
ty^  <  Tc.  Now  see  that  by  definition  of  all  the  regions  of  71^  intersecting  VJ.* 
contain  Px*(-'^*)  in  their  topological  closure.  Hence  we  have  f(x)  =  for 

all  X  G  G  (0, 1].  If  ^'(t)  is  solution  to  differential  equation  xa  = 

T'{t)  =  X<P'{t/X)  is  also  solution.  As  a  consequence  trajectory  must  reach 
A”p.r*  (2/1)  for  all  n.  From  the  definition  of  71'  this  implies  that  ^  reaches  the  pn 
of  the  lemma  for  all  n  :  see  figure  5. 


4.2 Problems Reach and Conv
Define the following problems:

Definition 10 Problems $Reach_k$, $Conv_k$. Let $k$ be either of type $k = d'$ or
of type $k = d'^+$ where $d'$ is an integer.

- Instance: A purely rational PCD system $\mathcal{H} = (X, f)$ of dimension $d$, a
polyhedral convex subset $V \subset X$, a rational polygon $x^1 \subset X$, a rational number
$t_{sup} \in \mathbb{Q}$, a rational number $t_{inf} \in \mathbb{Q}$, a rational point $x_0 \in X$.

Question "$Reach_k(\mathcal{H}, V, x_0, x^1, t_{inf}, t_{sup})$": "Do all the following conditions
hold simultaneously:

• trajectory $\Phi$ starting from $x_0$ reaches $x^1$ at some finite continuous time
$T_c$

• $t_{inf} \le T_c \le t_{sup}$

• for any $0 < t < T_c$, $x = \Phi(t)$ is in $V$ and is of local dimension $\le k$."

- Instance: A purely rational PCD system $\mathcal{H} = (X, f)$ of dimension $d$, a
polyhedral convex subset $V \subset X$, a rational point $x^* \in X$, a rational number
$t_{sup} \in \mathbb{Q}$, a rational number $t_{inf} \in \mathbb{Q}$, a rational point $x_0 \in X$.

Question "$Conv_k(\mathcal{H}, V, x_0, x^*, t_{inf}, t_{sup})$": "Do all the following conditions
hold simultaneously:

• the trajectory $\Phi$ starting from $x_0$ reaches point $x^*$ at some finite
continuous time $T_c$

• $x^*$ is of local dimension $k$ and is in $V$

• $t_{inf} \le T_c \le t_{sup}$

• for any $0 < t < T_c$, $x = \Phi(t)$ is in $V$ and is of local dimension $< k$."

4.3  Case  d  =  3 

Using topological considerations (the sphere of $\mathbb{R}^3$ satisfies the Jordan Theorem, and
the arguments of [3]) we prove:

Lemma 11. Let $\mathcal{H} = (X, f)$ be a PCD system of dimension $d$. Let $\Phi$ be a
trajectory of $\mathcal{H}$ of finite continuous time $T_c$ and discrete time $T_d \ge \omega$ converging
towards $x^* = \Phi(T_c)$. Assume that $x^*$ is of local dimension $\le 3^+$. Then necessarily
the signature of $\Phi$ is ultimately cyclic.

Lemma 12. The following problem is decidable:

Instance: a rational PCD system $\mathcal{H} = (X, f)$ of dimension $d$, a finite sequence
of distinct regions $(F_0, F_1, \ldots, F_j)$ of $\mathcal{H}$, a rational point $x_0 \in X$.

Question: "Does the trajectory $\Phi$ starting from $x_0$ have a periodic signature
of type $(F_0, F_1, \ldots, F_j)^\omega$ and then reach some point $x^* \in X$ of local dimension
$\le 3^+$ at some finite continuous time $t^*$?"

Moreover, given a positive instance, one can effectively compute $t^*$ and $x^*$ as
a function of the coordinates of $x_0$.


With these lemmas, we prove:

Theorem 13. The problems $Reach_3$ and $Reach_{3^+}$ are in $\Sigma_1$.

Proof (sketch). We prove the assertion by providing a Turing machine algorithm
that (semi-)computes the predicates: to reply to $Reach_{3^+}(\mathcal{H}, V, x_0, x^1, t_{inf}, t_{sup})$,
the general idea is the following: we simulate step by step the evolution of the
trajectory $\Phi$ starting from $x_0$. Simultaneously, if we detect that $\Phi$ crosses for
the second time a given region, we use lemma 12 to see if the signature of $\Phi$ is
entering or not an infinite cycle. If it is so, still by lemma 12, we compute directly
the limit of the cycle $x^*$ and the corresponding time $t^*$ and the simulation goes
on directly from new position $x^*$ and time $t^*$. We stop if we reach $x^1$ or the
complement of $V$, or if the time reaches a value greater than $t_{sup}$. From lemma
9, we know that every point of local dimension $k = 3$ or $k = 3^+$ can only be
reached using a finite number of points of local dimension $k$. From lemma 11
each such point $x$ of local dimension $k$ is reached by a cyclic signature and is
dropped by the algorithm.

4.4 Case $d' \ge 4$

We generalize theorem 13 to higher dimensions. We prove first:

Lemma 14. Let $d' \ge 4$. Assume that $Reach_{(d'-1)^+} \in \Sigma_p$ and $Reach_{(d'-2)^+} \in \Sigma_q$
for some integers $p, q$. Then

- $Conv_{d'} \in \Sigma_{\max(p, q+2)}$.

- $Conv_{d'^+} \in \Sigma_{\max(p, q+2)}$.

Proof. Denote by $B(x^*, 1/n_1)$ the ball of radius $1/n_1$ centered in $x^*$ for the norm
of the maximum. For a subset $U \subset X$, denote its complement by $U^c$. Let $k = d'$
or $k = d'^+$. We claim:

Convk[R,  K  xq,x* ,  tin  f )  tsup) 

X*  G  LocDim.{R,k)  A  x*  e  V  A  Ln/  <  tgup 
A  Bj/i  G  e  Q  7/1  G  14*  A  7/1, ^1,^2) 

3y2  €.  3/3,154  G  Q  3A  G  M"*" 

i?eac/i(d/_i)+(7{,  14* ,  yi,  y2,^3,^4) 

P:r*(y2)  =  h'>x*{yi) 

A  I  I  A  <  1 

/l  +  =  i  AT3  >  tinf 

t2  +  =  l  A*/4  <  tsup 

Ux'*(7/i)  +  Ei^i  ^'(<ix*{y2)  -  qx*{yi))  = 

^  V  V77.1  G  N  Reach  i^d‘-2)+{'^^  A",  yi,B[x*  ,\/ni)  )tinj  tipsup  ^2) 

Assume that we have a positive instance to formula $Conv_k$: use the notations
of definition 10. Denote by $S$ the set of the points that are reached by $\Phi$ before
time $T_c$ and that have local dimension $(d' - 1)^+$. Since $\Phi$ converges to $x^*$, there
must exist a $y_1 = \Phi(t_{y_1}) \in V_{x^*}$, $t_{y_1} < T_c$, that is reached by $\Phi$, and such that
$\Phi$ stays in $V_{x^*}$ between time $t_{y_1}$ and time $T_c$. $y_1$ is reached using points of local
dimension $\le (d' - 1)^+$. If $S$ is not a finite set, by lemma 9 the first clause of the
disjunction is true. Assume now that $S$ is a finite set: we can assume that $t_{y_1}$ is
chosen big enough such that $\Phi$ does not reach any point of $S$ between time $t_{y_1}$
and time $T_c$. For all $n_1 \in \mathbb{N}$ we get that the trajectory starting from $y_1$ reaches
$B(x^*, 1/n_1)$ using only points of local dimension $\le (d' - 2)^+$. Hence the second
clause of the disjunction is true.

Conversely, assume that the right hand side of the formula is true. If the first
clause of the disjunction is true, the trajectory is cycling and the formula $Conv_k$
should be true. Assume now that the second clause is true. For all $n_1 \in \mathbb{N}$, we get
that there exists $t_{n_1}$ such that $\Phi(t_{n_1}) \in B(x^*, 1/n_1)$. Denote $T_c = \sup_{n_1 \in \mathbb{N}} t_{n_1}$.
From the continuity of $\Phi$ we get that$^1$ $\Phi(T_c) = x^*$. Hence $\Phi$ reaches $x^*$ of local
dimension $k$ and formula $Conv_k$ must be true.

The  result  is  now  immediate  by  applying  the  Tarski-Kuratowski  algorithm 
on  the  formula  [8]. 

We  also  prove  in  a  similar  way: 

Lemma  15.  Let  d'  >  4.  Assume  Reach C  Sp  for  some  integer  p.  Then 
ConV(tf 

Proof (sketch). For a point $x^* \in X$ of local dimension $d$, define $Out_{x^*}$ as the
set of the points $x \in X$ such that the trajectory starting from $x$ intersects the
complement of $V_{x^*}$ at a discrete time less than or equal to one. We prove that, now,
the following formula holds:


Convd>  [R,  V,  xq  ,  X  ,  tin/  5  isup) 

X*  G  LocDim{7i,k)  A  x*  eV  A  tin/  <  tsup  A  dimensionfh)  =  d' 

A  3?/i  G  3/i,t2  G  Q  yi  G  14*  A  Reach(^d'-\)+{'ki,V,xo,yiXiM) 

{  Reach (^d' 2/1 3  ^in/  C  3  ^inf  ~~  ti  T  1) 

A  -^Reach(^d>-i)+{'^,  A,  yi ,  U  N oEvolutionfH)  U  ,  0,  ^2  -  t^up) 
A  ^'Rea.ch^^d' —i)  +  {R  3A,  yi,A,/5up  ^2  3  Ctip  ^2  +  1) 

We  get: 

Theorem  16.  Let  d'  >  3. 

-  Reachd'  is  in  Ed'-i- 
-  Reachdi+  is  in  Xd'-i  if  d'  is  even. 

-  Reach d>+  is  in  Xd>-2  if  d'  is  odd. 

Proof. The assertion is proved by induction over $d'$ using theorem 13, lemmas
15 and 14, by Tarski-Kuratowski and the fact that we have for $k = d'$ or $k = d'^+$:


$^1$ Note that if function $\Phi$ is not defined at $T_c$, since $\Phi$ is continuous with a
bounded right derivative, $\Phi$ can always be extended to a continuous function defined
at $T_c$.

$Reach_k(\mathcal{H}, V, x_0, x^1, t_{inf}, t_{sup})$

K,  a^o,  tinf )  ^sup) 

V  3n  e  N  3  <  x;,  xj,  x^, . . . ,  <  >e  O'*  3  <  io,  ■ .  ■ ,  <n  > 

3  <  tg,  ...  > 

^  X^  =  .To 

VO  <  i  <  n  Convki'H, 

<  1 )+ (7^ ,  V,  Tj^ ,  T 

I  to  i I  tn  tinf 

[  ^ Q  +  ^2  +  •  •  •  “1“  ^  ^sup 

By Tarski-Kuratowski on formula $n \in L \Leftrightarrow \exists t_1 \in \mathbb{N}\ Reach_d(\mathcal{H}, X, (r(n), 0, \ldots, 0), x^1, 0, t_1)$,
we get the main result of this section:

Corollary 17. - If $L$ is semi-recognized by a purely rational PCD system of
dimension $d$, then $L \in \Sigma_{d-2}$.

- If $L$ is recognized by a purely rational PCD system of dimension $d$, then
$L \in \Delta_{d-2}$.

And  by  using  theorem  6: 

Corollary 18. - The languages that are semi-recognized by purely rational
PCD systems of dimension $d$ in finite continuous time are precisely the
languages of $\Sigma_{d-2}$.

- The languages that are recognized by purely rational PCD systems of
dimension $d$ in finite continuous time are precisely the languages of $\Delta_{d-2}$.

Abstract. We study the monadic case of a decision problem known as
simultaneous rigid E-unification. We show its equivalence to an extension
of word equations. We prove decidability and complexity results for
special cases of this problem.


1  Introduction 

Simultaneous rigid E-unification is a combinatorial problem in equational logic
which is closely connected with some formulations of the Herbrand theorem and
with automated theorem proving by the tableau method and the connection
(or mating) method. In this section we define simultaneous rigid E-unification,
discuss its connection with several decision problems in logic and survey some
known results.

We shall consider equational logic, i.e. logic whose only predicate is the
equality predicate $\simeq$. Let $s_1, t_1, \ldots, s_n, t_n, s, t$ be terms. All atomic formulas in
equational logic are equations, i.e. expressions of the form $s \simeq t$. We do not distinguish
an equation $s \simeq t$ from the equation $t \simeq s$. We write $s_1 \simeq t_1, \ldots, s_n \simeq t_n \vdash s \simeq t$
to denote that the formula $(s_1 \simeq t_1 \wedge \ldots \wedge s_n \simeq t_n \supset s \simeq t)$ is true, i.e. it is provable
in first-order (classical or intuitionistic) logic. Equivalently, we can say that $s$
and $t$ lie in the same class of the congruence induced by $\{s_1 \simeq t_1, \ldots, s_n \simeq t_n\}$.

A rigid equation is an expression $\mathcal{E} \vdash_\forall s \simeq t$, where $\mathcal{E}$ is a finite set of equations.
The set $\mathcal{E}$ is called the left-hand side of this rigid equation, and the equation $s \simeq t$
is called its right-hand side. A solution to a rigid equation $\{s_1 \simeq t_1, \ldots, s_n \simeq t_n\} \vdash_\forall s \simeq t$
is any substitution $\theta$ such that $s_1\theta \simeq t_1\theta, \ldots, s_n\theta \simeq t_n\theta \vdash s\theta \simeq t\theta$. A system
of rigid equations is a finite set of rigid equations. A solution to a system of
rigid equations $\mathcal{R}$ is any substitution that is a solution to every rigid equation in
$\mathcal{R}$. The problem of solvability of rigid equations is known as rigid E-unification.
The problem of solvability of systems of rigid equations is known as simultaneous
rigid E-unification, in the sequel abbreviated as SREU.

*  Partially  supported  by  grants  from  NSF,  ONR  and  the  Faculty  of  Science  and  Tech¬ 
nology  of  Uppsala  University. 

** Supported by a TFR grant.

We shall denote sets of equations by $\mathcal{E}$, systems of rigid equations by $\mathcal{R}$ and
rigid equations by $R$. We shall sometimes write the left-hand side of a rigid
equation as a sequence of equations, for example $s_1 \simeq t_1, \ldots, s_n \simeq t_n \vdash_\forall s \simeq t$ instead of
$\{s_1 \simeq t_1, \ldots, s_n \simeq t_n\} \vdash_\forall s \simeq t$.

In  [2]  it  is  shown  that  the  decidability  of  SREU  is  equivalent  to  the  decid¬ 
ability  of  some  other  fundamental  problems,  for  example  the  decidability  of  the 
prenex  fragment  of  intuitionistic  logic  with  equality.  We  refer  to  [2,  6]  for  the 
discussion  of  these  problems. 

Best known (un)decidability results on SREU are the following: (i) SREU
with ground left-hand sides, two variables and three rigid equations is undecidable
(Veanes [16]); (ii) SREU with one variable is DEXPTIME-complete (Degtyarev,
Gurevich, Narendran, Veanes and Voronkov [3]). The last two results imply a
complete classification of decidable prenex fragments of intuitionistic predicate
calculus with equality: the $\exists\exists$ fragment is undecidable and the $\forall^*\exists\forall^*$ fragment
is decidable. All the above mentioned undecidability results require that the
signature contain a function symbol of arity $\ge 2$.

The special case of SREU when all function symbols have arity $\le 1$ is called
monadic SREU. The decidability of monadic SREU is an open problem. The
following facts are known about monadic SREU (Degtyarev, Matiyasevich and
Voronkov [4]).

• The word equation problem is effectively reducible to monadic SREU. (This
fact shows that if monadic SREU is decidable, its decidability should not be
easy to prove.)

•  Monadic  SREU  with  one  function  symbol  is  decidable  (this  fact  has  a  non¬ 
elementary  proof). 

•  Monadic  SREU  is  decidable  if  and  only  if  it  is  decidable  in  the  signature 
with  two  function  symbols. 

This  paper  studies  monadic  SREU.  Although  the  general  case  remains  an 
open  problem,  we  prove  its  equivalence  to  a  combinatorial  problem  of  words 
defined  in  Section  5.  This  problem  is  defined  in  terms  of  ideals  on  the  set  of 
pairs  of  words  and  called  the  ideal  equation  problem.  We  prove 

Theorem  4  Monadic  SREU  is  decidable  if  and  only  if  the  ideal  equation  prob¬ 
lem  is  decidable. 

We  also  prove  the  decidability  of  some  special  cases  of  monadic  SREU.  In 
Section  4  we  prove  a  result  similar  to  the  main  result  of  [3]: 

Theorem  3  Monadic  SREU  with  one  variable  is  PSPACE-complete. 

Plaisted  [13]  proved  that  SREU  with  ground  left-hand  sides  is  undecidable. 
The  corresponding  monadic  case  is  shown  to  be  decidable  in  Section  3: 

Theorem 2 Monadic SREU with ground left-hand sides is decidable.

The complexity of monadic SREU with ground left-hand sides is not known.
We  prove 

Theorem  1  Monadic  SREU  with  one  variable  and  ground  left-hand  sides  is 
PSPACE-hard. 

2  Preliminaries 

In  this  section  we  introduce  basic  definitions  concerning  terms,  equations,  words, 
word  equations,  automata  and  rewrite  rules.  We  have  to  define  so  many  concepts 
since  it  is  unreasonable  to  expect  the  reader  to  know  everything.  We  also  assert 
some  statements  proved  elsewhere  and  prove  some  properties  of  the  introduced 
notions  which  will  be  used  in  subsequent  sections. 

The symbol $\rightleftharpoons$ means "equal by definition".

Terms and equations. The set of all variables of a term $t$ is denoted $var(t)$. A
term is ground iff it has no variables, i.e. $var(t) = \emptyset$. The symbol $\vdash$ denotes
provability in first-order logic. When we write $\varphi_1, \ldots, \varphi_n \vdash \varphi$, where $\varphi_1, \ldots, \varphi_n, \varphi$
are formulas, it means provability of the formula $\varphi_1 \wedge \ldots \wedge \varphi_n \supset \varphi$. Substitutions
of terms $t_1, \ldots, t_n$ for variables $x_1, \ldots, x_n$ are denoted $\{t_1/x_1, \ldots, t_n/x_n\}$. The
application of such a substitution $\theta$ to a term $t$ is the operation of simultaneous
replacement of all occurrences of $x_i$ by $t_i$. The result of the application is
the term denoted $t\theta$. We shall also apply substitutions to equations and sets of
equations and use the same notation for the result of the application.

For any expression $E$ (for example, a term or a set of equations), we denote
by $E_c^t$ the expression obtained from $E$ by the replacement of all occurrences of
the constant $c$ by a term $t$. We write $s[t]$ to denote a particular occurrence of a
subterm $t$ of a term $s$.

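In this paper, we shall only consider monadic signatures consisting of a finite
set $\mathcal{F}$ of unary function symbols and a finite set $\mathcal{C}$ of constants. Such signatures
are denoted $(\mathcal{F}, \mathcal{C})$. The set of ground terms of this signature is denoted by $T_{\mathcal{F},\mathcal{C}}$.
We always assume $\mathcal{C} \neq \emptyset$ and hence $T_{\mathcal{F},\mathcal{C}} \neq \emptyset$. For a set of equations $\mathcal{E}$ we
denote by $T(\mathcal{E})$ the set of all terms occurring in $\mathcal{E}$ and their subterms. For example,
if $\mathcal{E} = \{f(x) \simeq g(c), c \simeq g(f(x))\}$, then $T(\mathcal{E}) = \{x, f(x), c, g(c), g(f(x))\}$.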

We shall denote variables by $x, y, z$, constants by $a, b, c, d$, function symbols
by $f, g, h$, terms by $r, s, t$ and substitutions by $\theta$.

We  shall  use  the  following  statement  proved  in  Kozen  [9]  or  Shostak  [15]. 

Lemma 1 (Derivability of equations is in PTIME) There is a polynomial-time
algorithm checking, for a given finite set of equations $\mathcal{E}$ and terms $s, t$,
whether $\mathcal{E} \vdash s \simeq t$.
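
To make the statement of Lemma 1 concrete, here is a naive congruence-closure sketch for monadic terms. It is an illustration under assumptions, not the algorithm of [9] or [15]: a term is encoded as a hypothetical pair `(base, word)`, where `base` is a constant or a variable (variables are treated as constants) and `word` lists the unary function symbols from the innermost to the outermost, one character per symbol. The name `derivable` is introduced here. The procedure is polynomial but not optimized.

```python
def subterms(term):
    """All subterms of a monadic term (base, word): the prefixes of the word."""
    base, word = term
    return [(base, word[:i]) for i in range(len(word) + 1)]


def derivable(equations, s, t):
    """Decide whether the equations entail s = t, by naive congruence closure.

    equations is a list of pairs of terms; terms are (base, word) pairs.
    """
    # Work on the finite set of subterms of everything mentioned.
    terms = set()
    for u in [s, t] + [side for eq in equations for side in eq]:
        terms.update(subterms(u))

    parent = {u: u for u in terms}

    def find(u):
        # Union-find lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    # Merge the two sides of every given equation.
    for left, right in equations:
        parent[find(left)] = find(right)

    # Propagate congruence: if u = v then f(u) = f(v), until nothing changes.
    applications = [u for u in terms if u[1]]
    changed = True
    while changed:
        changed = False
        for u in applications:
            for v in applications:
                same_symbol = u[1][-1] == v[1][-1]
                if same_symbol and find(u) != find(v):
                    arg_u = (u[0], u[1][:-1])
                    arg_v = (v[0], v[1][:-1])
                    if find(arg_u) == find(arg_v):
                        parent[find(u)] = find(v)
                        changed = True
    return find(s) == find(t)
```

For example, from $f(f(f(a))) \simeq a$ and $f(f(f(f(f(a))))) \simeq a$ the sketch derives $f(a) \simeq a$, by calling `derivable([(("a", "fff"), ("a", "")), (("a", "fffff"), ("a", ""))], ("a", "f"), ("a", ""))`.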

We write $\mathcal{E} \vdash \mathcal{E}'$ iff for any equation $(s \simeq t) \in \mathcal{E}'$ we have $\mathcal{E} \vdash s \simeq t$. In the
sequel we shall use the following lemma whose proof is standard.

Lemma 2 (Lemma on constants) Let $\mathcal{E}$ and $\mathcal{E}'$ be sets of equations. For any
constant $c$ and term $t$, if $\mathcal{E} \vdash \mathcal{E}'$, then $\mathcal{E}_c^t \vdash \mathcal{E}'^t_c$.

Words and finite automata. This section defines words and finite automata.
We shall also introduce a notation for monadic terms which allows us to pass
easily from terms to words and back.

Let $\mathcal{F}$ be a finite non-empty set, called the alphabet. Its elements are called
letters. Words are finite sequences of letters. We denote a word by the juxtaposition
of its letters, as $W = a_1 a_2 \ldots a_n$. The natural number n is called the length of
the word W and denoted |W|. We denote by $\varepsilon$ the empty word, which is the
unique word of length zero. The set of all words with letters in $\mathcal{F}$ is denoted by
$\mathcal{F}^*$.
It will be convenient for us to use the alphabet $\mathcal{F}$ also as the set of unary
function symbols of a monadic signature $(\mathcal{F},\mathcal{C})$. Every term s in such a signature
has the form $f_1(f_2(\ldots f_n(t) \ldots))$ where $n \geq 0$, $f_1, \ldots, f_n$ are unary function
symbols and t is a constant or a variable. We shall denote such a term s in the
reversed Polish notation, i.e. as $t f_n \ldots f_2 f_1$. Thus, every term can be represented
in the form tW, where t is a constant or a variable and W is a word. Similarly,
any term of the form $f_1(f_2(\ldots f_n(t) \ldots))$, where t is an arbitrary term, will be
written as $t f_n \ldots f_2 f_1$.
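
The passage between terms and words can be illustrated by a small Python sketch. It is not part of the original development: the function names are hypothetical, and we assume that every constant, variable and function symbol is a single character.

```python
def term_to_word(symbols, base):
    """Encode f1(f2(...fn(base)...)) as the word  base fn ... f2 f1.

    `symbols` lists the function symbols from the outermost (f1) to the
    innermost (fn); every symbol is assumed to be a single character.
    """
    return base + "".join(reversed(symbols))


def word_to_term(word):
    """Decode a word  t fn ... f2 f1  back into the nested term."""
    term = word[0]            # the constant or variable t
    for f in word[1:]:        # fn is applied first, f1 last
        term = f"{f}({term})"
    return term
```

For instance, the term f(g(c)) corresponds to the word cgf, and every subterm of a term corresponds to a prefix of its word.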

A finite automaton A on the alphabet $\mathcal{F}$ is a quadruple (Q, I, T, E), where
Q is a finite set, called the set of states, and I and T are distinguished subsets
of Q, called the sets of initial and terminal states, respectively. The set
$E \subseteq Q \times \mathcal{F} \times Q$ is the set of edges of A. An edge (p, f, q) is also denoted
$p \xrightarrow{f} q$. The automaton is deterministic iff whenever $(p, f, q_1) \in E$ and
$(p, f, q_2) \in E$, then $q_1 = q_2$.

A word $f_1 \ldots f_n$ is recognized by an automaton (Q, I, T, E) iff there is a
sequence of states $q_0 \ldots q_n$ such that $q_0 \in I$, $q_n \in T$ and $q_{i-1} \xrightarrow{f_i} q_i$ for all
$i \in \{1, \ldots, n\}$. A set of words is regular iff it is the set of words recognized by
some automaton.
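
The definition of recognition can be sketched in Python as follows. This is only an illustration under our own encoding: an automaton is a tuple of Python sets, edges are triples (p, f, q), and the example automaton is hypothetical.

```python
def recognizes(automaton, word):
    """Return True iff `word` is recognized by automaton = (Q, I, T, E)."""
    _states, initial, terminal, edges = automaton
    current = set(initial)    # states reachable after reading the prefix so far
    for letter in word:
        current = {q for (p, f, q) in edges if p in current and f == letter}
    return bool(current & set(terminal))


# Hypothetical example: words over {a, b} that end with the letter b.
ENDS_WITH_B = (
    {0, 1},                                                  # Q
    {0},                                                     # I
    {1},                                                     # T
    {(0, "a", 0), (0, "b", 1), (1, "a", 0), (1, "b", 1)},    # E
)
```

The sketch tracks the set of all reachable states, so it works for non-deterministic automata as well.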

The intersection nonemptyness of deterministic finite automata problem is
the following decision problem. Given any finite set $\{A_1, \ldots, A_n\}$ of deterministic
finite automata, is there a word recognized by each automaton in this set? The
following statement is proved in Kozen [10]:

Lemma 3 The intersection nonemptyness of deterministic finite automata
problem is PSPACE-complete.
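
To make the problem concrete, here is a naive Python sketch that searches the product of the given automata. It is an illustration only, not the algorithm behind Lemma 3: it may visit exponentially many product states, and the encoding (one initial state per automaton, a transition dictionary) is an assumption made for the example.

```python
from collections import deque


def intersection_nonempty(automata, alphabet):
    """Decide whether some word is recognized by every automaton.

    Each automaton is a triple (initial_state, terminal_states, delta) with
    delta[(p, f)] = q; missing entries mean that no edge exists.
    """
    start = tuple(a[0] for a in automata)
    seen = {start}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        # A product state is accepting iff every component is terminal.
        if all(q in a[1] for q, a in zip(state, automata)):
            return True
        for f in alphabet:
            successor = []
            for q, a in zip(state, automata):
                if (q, f) not in a[2]:
                    break                 # some automaton cannot read f here
                successor.append(a[2][(q, f)])
            else:
                successor = tuple(successor)
                if successor not in seen:
                    seen.add(successor)
                    queue.append(successor)
    return False
```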

Word equations. In addition to the alphabet $\mathcal{F}$, we shall also consider a countable
set $\mathcal{V}$ of word variables, denoted u, v, w. A word equation is any expression
of the form $V \simeq W$ where $V, W \in (\mathcal{F} \cup \mathcal{V})^*$. A word substitution is any expression
$\sigma = \{V_1/v_1, \ldots, V_n/v_n\}$, where $v_i$ are word variables and $V_i$ are words in
$\mathcal{F}^*$. Its domain, denoted dom($\sigma$), is the set $\{v_1, \ldots, v_n\}$. The application of such
a word substitution $\sigma$ to a word W is the operation of simultaneous
replacement of all occurrences of $v_i$ by $V_i$. The result of the application is the
word denoted $W\sigma$. A word substitution $\sigma$ is a solution to a word equation $U \simeq V$
iff all variables in U, V belong to dom($\sigma$) and we have $U\sigma = V\sigma$. A system of
word equations is any finite set of word equations; its solution is any substitution
solving all equations in the system. Words will be denoted by U, V, W, word
variables by u, v, w and word substitutions by $\sigma$.
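
The notions of application and solution can be sketched in Python. The encoding is our own assumption: a word over letters and variables is a list of symbols, a substitution is a dictionary from variables to strings, and the function names are hypothetical.

```python
def apply_substitution(word, sigma):
    """Replace every occurrence of a variable in `word` by its image.

    `word` is a list of symbols; symbols that are keys of `sigma` are
    treated as word variables, all other symbols as letters.
    """
    return "".join(sigma.get(symbol, symbol) for symbol in word)


def is_solution(lhs, rhs, sigma, variables):
    """Check that sigma solves the word equation lhs ~ rhs."""
    used = {s for s in lhs + rhs if s in variables}
    if not used <= set(sigma):        # every variable must be in dom(sigma)
        return False
    return apply_substitution(lhs, sigma) == apply_substitution(rhs, sigma)
```

For example, the substitution {aa/u} solves the word equation ua ~ au, while {b/u} does not.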

Makanin [11] proved that solvability of word equations is decidable. Analyzing
Makanin's algorithm, Schulz [14] proved the following result.

Lemma 4 (Decidability of word equations with regular constraints)
The problem of solvability of word equations where every word variable $u_i$ ranges
over a regular set $S_i$ is decidable.

It is known that the problem of solvability of word equations is NP-hard. No
good upper bound for the complexity of this problem has been obtained so far;
it is only known that the problem is in 3-NEXP (Koscielski and Pacholski [7, 8]).


Equational logic and rigid equations. Let $\mathcal{R}$ be a system of rigid equations.
The signature of $\mathcal{R}$ is defined as the signature consisting of all constants and
function symbols occurring in $\mathcal{R}$, and in addition a fixed constant if $\mathcal{R}$ contains
no constants. A solution $\theta$ to $\mathcal{R}$ is called grounding for $\mathcal{R}$ iff for every variable x
occurring in $\mathcal{R}$ the term $x\theta$ is ground. A substitution $\theta$ is called relevant for $\mathcal{R}$
iff for every variable x the term $x\theta$ is in the signature of $\mathcal{R}$.

In  the  sequel,  we  shall  need  the  following  technical  property  of  systems  of 
rigid  equations. 

Lemma 5 (Existence of relevant grounding solutions) Let $\mathcal{R}$ be a solvable
system of rigid equations. Then there exists a solution $\theta$ to $\mathcal{R}$ that is grounding
and relevant for $\mathcal{R}$.

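We shall introduce one particular kind of rigid equations that will be used as
a technical tool for proofs in this paper. For any monadic signature $(\mathcal{F},\mathcal{C})$, any
variable x and any constant $c \in \mathcal{C}$ introduce the following rigid equation:

$Gr_{(\mathcal{F},\mathcal{C},c)}(x) \rightleftharpoons \{d \simeq c \mid d \in \mathcal{C}\} \cup \{cf \simeq c \mid f \in \mathcal{F}\} \vdash_\forall x \simeq c$

We shall use the following obvious lemma:

Lemma 6 A substitution $\theta$ is a solution to $Gr_{(\mathcal{F},\mathcal{C},c)}(x)$ iff $x\theta \in T_{(\mathcal{F},\mathcal{C})}$.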

As  a  consequence,  we  have 

Lemma 7 For any system $\mathcal{R}$ of rigid equations there is a system $\mathcal{R}'$ of rigid
equations such that for any substitution $\theta$, $\theta$ is a solution to $\mathcal{R}'$ if and only if $\theta$
is a grounding relevant solution to $\mathcal{R}$. In addition, $\mathcal{R}'$ can be found from $\mathcal{R}$ using
a polynomial-time algorithm; and $\mathcal{R}'$ has ground left-hand sides if $\mathcal{R}$ has ground
left-hand sides.

Proof. Let $x_1, \ldots, x_n$ be all variables in $\mathcal{R}$ and $(\mathcal{F},\mathcal{C})$ be the signature of $\mathcal{R}$. Define
$\mathcal{R}' \rightleftharpoons \mathcal{R} \cup \{Gr_{(\mathcal{F},\mathcal{C},c)}(x_i) \mid i \in \{1, \ldots, n\}\}$. Then apply Lemma 6.
Rewrite rules. This section introduces a technique standard in the theory of
ground systems of rewrite rules. However, we shall use ordinary equations instead
of rewrite rules.

Introduce an ordering $\succ$ on terms in $T_{(\mathcal{F},\mathcal{C})}$ in the following way. Let > be
any total ordering on $\mathcal{F} \cup \mathcal{C}$ and $s = c f_1 \ldots f_m$, $t = d g_1 \ldots g_n$. Then $s \succ t$ iff one
of the following conditions is true:

1. m > n;

2. m = n and the string $c f_1 \ldots f_m$ is greater than $d g_1 \ldots g_n$ in the lexicographic
ordering induced by >.
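
The comparison of two ground terms in this ordering can be sketched in Python. The sketch assumes that terms are given in the word notation with single-character symbols, and that the total ordering > on symbols is supplied as a dictionary `rank` (a hypothetical encoding chosen for the example).

```python
def greater(s, t, rank):
    """Return True iff s is greater than t in the length-lexicographic order.

    s and t are ground terms in word notation (constant first); rank maps
    every symbol to its position in the chosen total ordering on symbols.
    """
    if len(s) != len(t):
        return len(s) > len(t)                    # condition 1: longer term wins
    return [rank[a] for a in s] > [rank[a] for a in t]   # condition 2
```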

The ordering $\succ$ is total, noetherian and can be extended to a simplification
ordering [1]. Some properties of the ordering $\succ$ formulated below are simple
consequences of standard statements in the theory of rewrite systems. Their proofs
may be found in e.g. [1]. Note that the ordering $\succ$ depends on the ordering
>. In the definitions below we assume that we have chosen a fixed ordering >
on $\mathcal{F} \cup \mathcal{C}$, and hence $\succ$ is also fixed.

Let E, E' be finite sets of ground equations and let E contain distinct equations
$s \simeq t$ and $r[s] \simeq u$. We say that E' is obtained from E by simplification from
$s \simeq t$ into $r[s] \simeq u$, denoted $E \to E'$, iff

$E' = (E \setminus \{r[s] \simeq u\}) \cup \{r[t] \simeq u\}$

The reflexive and transitive closure of the relation $\to$ on sets of ground equations
is denoted by $\to^*$. A set of equations E is called irreducible iff there exists no E'
such that $E \to E'$.

Let E be an irreducible set of ground equations. We write $t \to_E t'$ if there
exists an equation $(r \simeq s) \in E$ such that $r \succ s$, and t' is obtained from t by the
replacement of one occurrence of the subterm r by s. The relation $\to_E^*$ is the
reflexive and transitive closure of $\to_E$. A term t is called irreducible with respect
to E iff there is no term s such that $t \to_E s$. The normal form of a term t w.r.t.
E, denoted $t{\downarrow}_E$, is the term s such that $t \to_E^* s$ and s is irreducible w.r.t. E.
The normal form of any term exists and is unique. We shall use the following
statement, which is easy to prove.

Lemma 8 Let E be an irreducible set of ground equations and s, t be terms. Then
$E \vdash s \simeq t$ if and only if $s{\downarrow}_E = t{\downarrow}_E$.
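
In the monadic case the subterms of a ground term written as a word are exactly its prefixes, so a rewriting step replaces a prefix. The following Python sketch illustrates normal forms and the test of Lemma 8 under assumptions of our own: symbols are single characters, the equations are given as pairs (lhs, rhs) already oriented so that lhs is the greater side, and the set is irreducible.

```python
def normal_form(term, rules):
    """Rewrite prefixes of `term` with the oriented rules until none applies."""
    while True:
        for lhs, rhs in rules:
            if term.startswith(lhs):
                term = rhs + term[len(lhs):]   # replace the subterm lhs by rhs
                break
        else:
            return term                        # no rule applies: irreducible


def derivable(s, t, rules):
    """Test derivability of s ~ t by comparing normal forms (Lemma 8)."""
    return normal_form(s, rules) == normal_form(t, rules)
```

With the hypothetical rules ca → e and ea → c, the term caa has the normal form c, so the equation caa ~ c is derivable while ca ~ c is not.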


Mixing words and rigid equations. We call a word term, or simply w-term,
in the signature $(\mathcal{F},\mathcal{C})$ any expression of the form cW such that $c \in \mathcal{C}$ and
$W \in (\mathcal{F} \cup \mathcal{V})^*$. A w-equation is any expression $cV \simeq dW$, where cV and dW
are w-terms. A rigid w-equation is any expression of the form $\mathcal{W} \vdash_\forall cV \simeq dW$,
where $\mathcal{W}$ is a finite set of w-equations, cV and dW are w-terms. A system of
rigid w-equations is any finite set of rigid w-equations. The signature of a system
of rigid w-equations is defined similarly to that of a system of rigid equations. Sets
of w-equations will be denoted by $\mathcal{W}$, and sets of rigid w-equations by $\mathcal{S}$.
A solution to a rigid w-equation $\mathcal{W} \vdash_\forall cV \simeq dW$ is any word substitution $\sigma$
whose domain contains all word variables in $\mathcal{W}$, V, W such that
$\mathcal{W}\sigma \vdash cV\sigma \simeq dW\sigma$. A solution to a system $\mathcal{S}$ of rigid w-equations is any word
substitution that is a solution to every rigid w-equation in $\mathcal{S}$.

Note  that  a  ground  w-equation  is  also  an  ordinary  equation. 

In  Lemma  9  below  we  show  that  one  can  consider  systems  of  rigid  w-equations 
instead  of  systems  of  rigid  equations.  The  following  technical  lemma  is  proved 
in  [6]: 

Lemma 9 The problem of solvability of systems of rigid w-equations is
polynomial-time reducible to monadic SREU. Monadic SREU is effectively reducible to
the problem of solvability of systems of rigid w-equations.


3  Ground  left-hand  sides 

In  this  section  we  prove  that  monadic  SREU  with  ground  left-hand  sides  is 
decidable  and  PSPACE-hard. 


SREU  with  ground  left-hand  sides  is  PSPACE-hard. 

Lemma 10 Let A = (Q, I, T, E) be a deterministic finite automaton over $\mathcal{F}$.
There exists a system $\mathcal{R}$ of two monadic rigid equations of one variable x with
the following properties:

1. $\mathcal{R}$ has ground left-hand sides;

2. for every solution $\theta$ to $\mathcal{R}$ we have $x\theta = cW$, where $W \in \mathcal{F}^*$ and c is a fixed
constant;

3. for any word $W \in \mathcal{F}^*$, the substitution {cW/x} is a solution to $\mathcal{R}$ if and only
if W is recognized by A.

In addition, $\mathcal{R}$ can be effectively constructed from A using a polynomial-time
algorithm.

Proof. Without loss of generality we can assume that I consists of one state (see e.g.
[12]). By renaming states, we can assume that I = {c}. Let F be a unary function
symbol fresh for $\mathcal{F}$ and d be a constant fresh for Q. Define $\mathcal{R}$ as $\{R_1, R_2\}$, where

$R_1 = \{pf \simeq q \mid (p \xrightarrow{f} q) \in E\} \cup \{rF \simeq d \mid r \in T\} \vdash_\forall xF \simeq d$
$R_2 = Gr_{(\mathcal{F},\{c\},c)}(x)$

Consider any substitution $\theta = \{t/x\}$. By Lemma 6, $\theta$ is a solution to $R_2$ if and only
if t has the form cW such that $W \in \mathcal{F}^*$. Consider when such a substitution {cW/x} is
also a solution to $R_1$. By definition, this means

$\{pf \simeq q \mid (p \xrightarrow{f} q) \in E\} \cup \{rF \simeq d \mid r \in T\} \vdash cWF \simeq d$    (1)

Since the automaton is deterministic, the left-hand side of (1) is irreducible. Using
Lemma 8, one can see that (1) holds if and only if W is recognized by A. Evidently,
$\mathcal{R}$ is constructed from A in polynomial time.
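
The construction of the left-hand side of (1) and the check through normal forms can be illustrated in Python. This is a sketch under our own assumptions, not the authors' implementation: states and letters are single characters, the initial state is named c, the fresh symbols are the characters F and d, and all function names are hypothetical.

```python
def normal_form(term, rules):
    """Rewrite prefixes of `term` with the oriented rules until none applies."""
    while True:
        for lhs, rhs in rules:
            if term.startswith(lhs):
                term = rhs + term[len(lhs):]
                break
        else:
            return term


def lemma10_equations(edges, terminal, fresh_symbol="F", fresh_constant="d"):
    """Build the equations pf ~ q for every edge and rF ~ d for terminal r."""
    rules = [(p + f, q) for (p, f, q) in sorted(edges)]
    rules += [(r + fresh_symbol, fresh_constant) for r in sorted(terminal)]
    return rules


def solves_r1(word, rules, initial="c", fresh_symbol="F", fresh_constant="d"):
    """Check (1): the term cWF must have the normal form d."""
    return normal_form(initial + word + fresh_symbol, rules) == fresh_constant
```

For a hypothetical automaton with edges c -a-> e and e -a-> c and terminal state e, the check succeeds exactly for words with an odd number of letters a.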




Lemma  11  The  intersection  nonemptyness  of  deterministic  finite  automata 
problem  is  polynomial-time  reducible  to  monadic  SREU  with  one  variable  and 
ground  left-hand  sides. 

Proof. Let $A_1, \ldots, A_n$ be deterministic finite automata. Let $\mathcal{R}_i$, where $i \in \{1, \ldots, n\}$,
be the system of rigid equations constructed from $A_i$ as in Lemma 10. Define
$\mathcal{R} = \bigcup_{i=1}^{n} \mathcal{R}_i$. By Lemma 10, every solution to $\mathcal{R}$ has the form {cW/x} and any
substitution {cW/x} is a solution to $\mathcal{R}$ if and only if W is recognized by each $A_i$.
Hence, $\mathcal{R}$ is solvable if and only if there is a word recognized by all $A_i$. Evidently,
$\mathcal{R}$ is constructed from $A_1, \ldots, A_n$ in polynomial time.

Combining  Lemmas  3  and  11  we  obtain 

Theorem  1  Monadic  SREU  with  one  variable  and  ground  left-hand  sides  is 
PSPACE-hard. 

Monadic SREU with ground left-hand sides is decidable. A finite set $\mathcal{E}$
of equations is in the automaton form iff

1. every equation in $\mathcal{E}$ has the form $cf \simeq d$;

2. for every two w-equations $cf \simeq d_1$ and $cf \simeq d_2$ in $\mathcal{E}$ we have $d_1 = d_2$.

Note that any set of equations in the automaton form is irreducible. The
following statement is proved in [6]:

Lemma  12  Given  any  rigid  w-equation  S  with  ground  left-hand  side,  one  can 
effectively  find  in  polynomial  time  a  rigid  w-equation  S'  with  ground  left-hand 
side  such  that 

1.  S  and  S'  have  the  same  solutions; 

2.  the  left-hand  side  of  S'  is  in  the  automaton  form. 

Let $\mathcal{E}$ be a set of equations in the automaton form and c, d be any constants.
Denote by $A(\mathcal{E}, c, d)$ the following automaton (Q, I, T, E). Its alphabet is the set
of function symbols occurring in $\mathcal{E}$. The set of states Q is the set of all constants
occurring in $\mathcal{E}$, c, d. The sets of initial states and terminal states are defined by
$I \rightleftharpoons \{c\}$ and $T \rightleftharpoons \{d\}$. Finally, the set of edges is defined by

$E \rightleftharpoons \{a \xrightarrow{f} b \mid (af \simeq b) \in \mathcal{E}\}$.

Lemma 13 A word W is recognized by $A(\mathcal{E}, c, d)$ if and only if $\mathcal{E} \vdash cW \simeq d$.
Proof. Immediate by Lemma 8.

Lemma 14 Let $\mathcal{E}$ be a set of equations in the automaton form, $W, W' \in \mathcal{F}^*$ and
c, c' be constants. Then $\mathcal{E} \vdash cW \simeq c'W'$ if and only if there is a constant d and
words U, U', V such that W = UV, W' = U'V, U is recognized by $A(\mathcal{E}, c, d)$ and
U' is recognized by $A(\mathcal{E}, c', d)$.
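Proof.

($\Rightarrow$) We have $\mathcal{E} \vdash cW \simeq c'W'$. By Lemma 8 we have $cW{\downarrow}_\mathcal{E} = c'W'{\downarrow}_\mathcal{E}$. Choose d and V
such that $cW{\downarrow}_\mathcal{E} = dV$. Define U and U' such that W = UV and W' = U'V. We
have $\mathcal{E} \vdash cU \simeq d$ and $\mathcal{E} \vdash c'U' \simeq d$. By Lemma 13 words U and U' are recognized
by $A(\mathcal{E}, c, d)$ and $A(\mathcal{E}, c', d)$, respectively.

($\Leftarrow$) We have W = UV, W' = U'V, U is recognized by $A(\mathcal{E}, c, d)$ and U' is recognized
by $A(\mathcal{E}, c', d)$. By Lemma 13 we have $\mathcal{E} \vdash cU \simeq d$ and $\mathcal{E} \vdash c'U' \simeq d$. Hence,
$\mathcal{E} \vdash cUV \simeq dV$ and $\mathcal{E} \vdash c'U'V \simeq dV$. Then $\mathcal{E} \vdash cUV \simeq c'U'V$, i.e. $\mathcal{E} \vdash cW \simeq c'W'$.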

□ 
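
The condition of Lemma 14 can be tested directly by trying every common suffix V. The Python sketch below is an illustration under assumptions of our own: the equations in automaton form are stored as a dictionary mapping a pair (a, f) to b for every equation af ~ b, symbols are single characters, and the function names are hypothetical.

```python
def run(edges, start, word):
    """Follow the deterministic edges from `start`; None if the run is stuck."""
    state = start
    for f in word:
        state = edges.get((state, f))
        if state is None:
            return None
    return state


def lemma14_condition(edges, c, w, c2, w2):
    """Look for d, U, U', V with w = UV, w2 = U'V and runs c -U-> d, c2 -U'-> d."""
    for k in range(min(len(w), len(w2)) + 1):
        v = w[len(w) - k:]                    # candidate common suffix of length k
        if w2[len(w2) - k:] != v:
            break                             # longer suffixes cannot agree either
        d1 = run(edges, c, w[:len(w) - k])
        d2 = run(edges, c2, w2[:len(w2) - k])
        if d1 is not None and d1 == d2:
            return True
    return False
```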


Lemma 15 The problem of solvability of systems of rigid w-equations with ground
left-hand sides effectively reduces to word equations with regular constraints.

Proof. Let $\mathcal{S} = \{S_1, \ldots, S_n\}$ be such a system of rigid w-equations. By Lemma 12
we can assume that the left-hand sides of all $S_i$ are in the automaton form. Let
$S_i = (\mathcal{E}_i \vdash_\forall c_i W_i \simeq c'_i W'_i)$, for all $i \in \{1, \ldots, n\}$. Let $u_1, \ldots, u_n$, $v_1, \ldots, v_n$ and
$u'_1, \ldots, u'_n$ be word variables fresh for $\mathcal{S}$. By Lemma 14, the system $\mathcal{S}$ is solvable if
and only if there are constants $d_i$ occurring in $S_i$, for all $i \in \{1, \ldots, n\}$, such that the
following system of word equations and regular constraints is solvable:

$W_1 \simeq u_1 v_1$      $u_1$ is recognized by $A(\mathcal{E}_1, c_1, d_1)$
$\ldots$
$W_n \simeq u_n v_n$      $u_n$ is recognized by $A(\mathcal{E}_n, c_n, d_n)$
$W'_1 \simeq u'_1 v_1$      $u'_1$ is recognized by $A(\mathcal{E}_1, c'_1, d_1)$
$\ldots$
$W'_n \simeq u'_n v_n$      $u'_n$ is recognized by $A(\mathcal{E}_n, c'_n, d_n)$

To conclude the proof we note that there is only a finite number of choices for $d_i$.

Theorem  2  Monadic  SREU  with  ground  left-hand  sides  is  decidable. 

Proof.  By  Lemma  9  monadic  SREU  with  ground  left-hand  sides  is  effectively  reducible 
to  the  problem  of  solvability  of  systems  of  rigid  w-equations.  By  Lemma  15  the  latter 
problem  is  effectively  reducible  to  word  equations  with  regular  constraints.  Then  apply 
Lemma  4. 


4  One-variable  case 

In this section we consider rigid equations with one variable x. We shall write
$\mathcal{E}(x)$ to denote all occurrences of a variable x in $\mathcal{E}$, and write $\mathcal{E}(t)$ to denote the
set of equations obtained from $\mathcal{E}$ by replacement of all occurrences of x by t.
We shall use similar notation for terms, for example s(x). Using this notation,
we can write any rigid equation of one variable x as $\mathcal{E}(x) \vdash_\forall s(x) \simeq t(x)$. The
following statement is proved in [6]:

Lemma 16 Let $\mathcal{E}(x)$ be a finite set of equations of one variable x and s(x), t(x)
be terms of one variable x such that $\mathcal{E}(x) \nvdash s(x) \simeq t(x)$. Let c be a constant fresh
for $\mathcal{E}(x)$, s(x), t(x) and r be a ground term such that c does not occur in r. If
$\mathcal{E}(r) \vdash s(r) \simeq t(r)$, then there exists a ground term $r' \in T(\mathcal{E}(c) \cup \{s(c) \simeq t(c)\})$
such that $\mathcal{E}(c) \vdash r \simeq r'$.
Lemma 17 Let $\mathcal{E}(x) \vdash_\forall s(x) \simeq t(x)$ be a rigid equation of one variable x, c be a
constant fresh for this rigid equation, r be a ground term in which c does not occur
and $\mathcal{E}(x) \nvdash s(x) \simeq t(x)$. Then the substitution $\theta = \{r/x\}$ is a solution to this
rigid equation if and only if there is a ground term $r' \in T(\mathcal{E}(c) \cup \{s(c) \simeq t(c)\})$
such that $\mathcal{E}(c), \mathcal{E}(r') \vdash s(r') \simeq t(r')$ and $\theta$ is a solution to $\mathcal{E}(c) \vdash_\forall r' \simeq x$.

Proof.

($\Rightarrow$) We have that $\theta$ is a solution to $\mathcal{E}(x) \vdash_\forall s(x) \simeq t(x)$. Then $\mathcal{E}(r) \vdash s(r) \simeq t(r)$. By
Lemma 16 there is a term $r' \in T(\mathcal{E}(c) \cup \{s(c) \simeq t(c)\})$ such that $\mathcal{E}(c) \vdash r \simeq r'$.
Then $\mathcal{E}(c), \mathcal{E}(r') \vdash s(r') \simeq t(r')$.

($\Leftarrow$) We have $\mathcal{E}(c), \mathcal{E}(r') \vdash s(r') \simeq t(r')$ and $\mathcal{E}(c) \vdash r' \simeq r$. Then $\mathcal{E}(c), \mathcal{E}(r) \vdash s(r) \simeq
t(r)$. By Lemma 2 we can substitute r for c obtaining $\mathcal{E}(r) \vdash s(r) \simeq t(r)$. □

Lemmas  16  and  17  also  hold  for  non-monadic  signatures  [3]. 

Lemma  18  Monadic  SREU  with  one  variable  is  in  PSPACE. 

Proof.  We  shall  give  a  non-deterministic  algorithm  reducing  monadic  SREU  with  one 
variable  to  the  intersection  nonemptyness  of  deterministic  finite  automata  problem. 

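Let $\mathcal{R}$ be a system of rigid equations of one variable x whose signature is $(\mathcal{F},\mathcal{C})$. It
has the form

$\mathcal{E}_1(x) \vdash_\forall s_1(x) \simeq t_1(x)$   $\ldots$   $\mathcal{E}_n(x) \vdash_\forall s_n(x) \simeq t_n(x)$

By Lemma 5 we can restrict ourselves to relevant grounding solutions $\theta = \{r/x\}$ only.
Let c be a constant fresh for $(\mathcal{F},\mathcal{C})$. By Lemma 17, $\theta$ is a solution to $\mathcal{R}$ if and only if
there are ground terms $r'_i \in T(\mathcal{E}_i(c) \cup \{s_i(c) \simeq t_i(c)\})$, where $i \in \{1, \ldots, n\}$, such that
$\mathcal{E}_i(c), \mathcal{E}_i(r'_i) \vdash s_i(r'_i) \simeq t_i(r'_i)$ and $\theta$ is a solution to the system

$\mathcal{E}_1(c) \vdash_\forall r'_1 \simeq x$   $\ldots$   $\mathcal{E}_n(c) \vdash_\forall r'_n \simeq x$

Nondeterministically select such $r'_1, \ldots, r'_n$ and verify the condition
$\mathcal{E}_i(c), \mathcal{E}_i(r'_i) \vdash s_i(r'_i) \simeq t_i(r'_i)$ (it can be checked in polynomial time using Lemma 1).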

Such $\theta$ is a solution to this system of rigid equations if and only if there is a constant
$d \in \mathcal{C}$ such that the following system of rigid w-equations is solvable:

$\mathcal{E}_1(c) \vdash_\forall r'_1 \simeq dx$   $\ldots$   $\mathcal{E}_n(c) \vdash_\forall r'_n \simeq dx$

Nondeterministically select such d. By Lemma 12 we can equivalently replace this
system with a system

$\mathcal{E}'_1 \vdash_\forall c_1 \simeq d_1 x$   $\ldots$   $\mathcal{E}'_n \vdash_\forall c_n \simeq d_n x$

where $\mathcal{E}'_i$ are in the automaton form. By Lemma 13, this system is solvable if and only
if the intersection of automata $A(\mathcal{E}'_1, d_1, c_1), \ldots, A(\mathcal{E}'_n, d_n, c_n)$ is non-empty.

We  have  given  a  non-deterministic  algorithm  reducing  monadic  SREU  with  one 
variable  to  the  intersection  nonemptyness  of  deterministic  finite  automata  problem. 
On  each  branch,  the  algorithm  makes  polynomially  many  steps.  Applying  Lemma  3 
on  the  complexity  of  the  intersection  nonemptyness  of  deterministic  finite  automata 
problem  we  get  that  monadic  SREU  with  one  variable  is  in  NPSPACE,  and  hence  in 
PSPACE. 

Combining  Theorem  1  and  Lemma  18,  we  obtain 
Theorem 3 Monadic SREU with one variable is PSPACE-complete.
5 General case

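Denote by $\mathbb{W}$ the set of pairs of words on $\mathcal{F}$. Introduce on $\mathbb{W}$ a binary function $\circ$,
a unary function ${}^{-1}$ and a binary relation $\leq$ in the following way:

$(U_1, U_2) \circ (V_1, V_2) \rightleftharpoons \begin{cases} (U_2 W, V_2) & \text{if } V_1 \text{ has the form } U_1 W \\ (V_1, V_2) & \text{otherwise} \end{cases}$

$(U_1, U_2)^{-1} \rightleftharpoons (U_2, U_1)$

$(U_1, U_2) \leq (V_1, V_2) \rightleftharpoons$ there is a word W such that $(V_1, V_2) = (U_1 W, U_2 W)$

An ideal on $\mathbb{W}$ is any set of pairs containing $(\varepsilon, \varepsilon)$, closed under $\circ$ and ${}^{-1}$, and
upward closed under $\leq$. The ideal generated by a set of pairs S, denoted ideal(S),
is defined as the least ideal containing S.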
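
The binary function, the unary function and the relation on pairs of words introduced above can be sketched in Python. The names `compose`, `inverse` and `leq` are hypothetical labels chosen for this illustration; pairs of words are encoded as tuples of strings.

```python
def compose(p, q):
    """Binary function: (U1, U2), (V1, V2) -> (U2 W, V2) if V1 = U1 W."""
    (u1, u2), (v1, v2) = p, q
    if v1.startswith(u1):
        return (u2 + v1[len(u1):], v2)
    return (v1, v2)                      # otherwise the second pair is returned


def inverse(p):
    """Unary function: swap the components of the pair."""
    return (p[1], p[0])


def leq(p, q):
    """Relation: q is obtained from p by appending one word W to both parts."""
    (u1, u2), (v1, v2) = p, q
    if not v1.startswith(u1):
        return False
    return v2 == u2 + v1[len(u1):]
```

The first case of `compose` mirrors an equational step: from aU1 ~ aU2 and aU1W ~ aV2 one obtains aU2W ~ aV2.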

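An ideal equation is an expression

$(U, V) \in ideal(\{(U_1, V_1), \ldots, (U_n, V_n)\})$,

where n > 0 and $U, V, U_1, \ldots, U_n, V_1, \ldots, V_n \in (\mathcal{F} \cup \mathcal{V})^*$. A solution to such an ideal
equation is any word substitution $\sigma$ such that

1. $U\sigma, V\sigma, U_1\sigma, \ldots, U_n\sigma, V_1\sigma, \ldots, V_n\sigma$ are words over $\mathcal{F}$;

2. the pair $(U\sigma, V\sigma)$ belongs to the ideal generated by

$\{(U_1\sigma, V_1\sigma), \ldots, (U_n\sigma, V_n\sigma)\}$.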

A  system  of  ideal  equations  is  any  finite  set  of  ideal  equations.  Solutions  to 
a  system  of  ideal  equations  are  substitutions  that  solve  each  equation  in  the 
system.  The  ideal  equations  problem  is  the  decision  problem  of  solvability  of 
systems  of  ideal  equations.  The  aim  of  this  section  is  to  show  that  monadic 
SREU  is  equivalent  to  the  ideal  equations  problem. 

The  following  lemma  proved  in  [6]  is  the  main  reason  for  introducing  the 
notion  of  an  ideal. 

Lemma 19 Let $U_1, \ldots, U_n, V_1, \ldots, V_n, U, V$ be words on $\mathcal{F}$ and a be any constant.
Then $aU_1 \simeq aV_1, \ldots, aU_n \simeq aV_n \vdash aU \simeq aV$ if and only if
$(U, V) \in ideal(\{(U_1, V_1), \ldots, (U_n, V_n)\})$.

Theorem 4 Monadic SREU is decidable if and only if the ideal equations problem
is decidable.

Proof.  See  [6]. 

Technical report [6] discusses ideal equations in more detail. In particular, it
is shown that ideal equations are decidable if and only if word equations extended
by a family of predicates behaving like a greatest common divisor on words are
decidable. In addition, the following statement is proved:

Lemma 20 Ideal equations are decidable if and only if ideal equations with regular
constraints and the inequality constraints $U \not\simeq V$ are decidable.
Abstract. While computability theory on many countable sets is well
established and for computability on the real numbers several (mutually
non-equivalent) definitions are applied, for most other uncountable sets,
in particular for measures, no generally accepted computability concepts
at all have been available until now. In this contribution we introduce
computability on the set M of probability measures on the Borel subsets
of the unit interval [0; 1]. Its main purpose is to demonstrate that
this concept of computability is not merely an ad hoc definition but has
very natural properties. Although the definitions and many results can
of course be transferred to more general spaces of measures, we restrict
our attention to M in order to keep the technical details simple and
concentrate on the central ideas. In particular, we show that simple obvious
requirements exclude a number of similar definitions, that the definition
leads to the expected computability results, that there are other natural
definitions inducing the same computability theory and that the
theory is embedded smoothly into classical measure theory. As background
we consider TTE, Type 2 Theory of Effectivity [KW84, KW85],
which provides a frame for very realistic computability definitions. In
this approach, computability is defined on finite and infinite sequences
of symbols explicitly by Turing machines and on other sets by means
of notations and representations. Canonical representations are derived
from information structures [Wei97]. We introduce a standard representation
$\delta_m :\subseteq \Sigma^\omega \to M$ via some natural information structure defined
by a subbase $\sigma$ (the atomic properties) of some topology $\tau$ on M and
a standard notation of $\sigma$. While several modifications of $\delta_m$ suggesting
themselves at first glance violate simple and obvious requirements, $\delta_m$
has several very natural properties and hence should induce an important
computability theory. Many interesting functions on measures turn
out to be computable, in particular linear combination, integration of
continuous functions and any transformation defined by a computable
iterated function system with probabilities. Some other natural representations
of M are introduced, among them a Cauchy representation
associated with the Hutchinson metric, and proved to be equivalent to
$\delta_m$. As a corollary, the final topology $\tau$ of $\delta_m$ is the well known weak
topology on M.
1 Introduction

Measure and integration is a central branch of mathematics pervading almost
all parts of abstract analysis. Several authors have already considered questions
of effectivity, constructivity, computability or computational complexity
in measure or integration theory. Kushner [Kus85] studies computability and
Ko [Ko91] computational complexity of integration. Bishop and Bridges [BB85]
present constructive measure theory extensively. Although they do not consider
computability, certainly many of their concepts and results have computational
counterparts. Edalat gives a domain theoretic approach to effective integration
[Eda95, Eda96]. He also does not consider computability, but it should be possible
to extend his topological approach by computability concepts. Traub et al.
[TWW88] investigate the computational complexity of numerical algorithms for
integration in the real number model of computation. However, this model is
unrealistic in many situations and therefore not generally accepted. A systematic
study of computability in integration and measure theory does not yet exist.
In this paper we introduce a very natural and realistic computability theory on
probability measures. We achieve this by extending TTE, Type 2 Theory of
Effectivity, to measure theory. TTE has been introduced by Kreitz and Weihrauch
[KW84, KW85] as a general framework for studying effectivity, i.e. continuity,
computability and computational complexity, in Analysis. For details the reader
is referred to the introduction [Wei95] and a recent short survey [Wei97]
containing most of the notations we shall use in this paper. More details can be
found in [KW85, Wei87]. Since this paper is a first attempt, we consider only
the space of probability measures on the Borel subsets of the real unit interval.

By $f :\subseteq A \to B$ we denote a partial function, i.e. a function from a subset
of A to B. Throughout this paper let $\Sigma$ be a sufficiently large finite alphabet.
Let $\Sigma^*$ be the set of finite words and $\Sigma^\omega = \{p \mid p : \omega \to \Sigma\}$ the set of $\omega$-words
over $\Sigma$. On $\Sigma^*$ we consider the discrete topology and on $\Sigma^\omega$ the Cantor topology
defined by the basis $\{w\Sigma^\omega \mid w \in \Sigma^*\}$. For $Y_0, Y_1, \ldots, Y_k \in \{\Sigma^*, \Sigma^\omega\}$, a
function $f :\subseteq Y_1 \times \ldots \times Y_k \to Y_0$ is called computable, iff it is computed
by a Turing machine with a one-way output tape. Every computable function
is continuous. The basic idea of TTE is to use finite or infinite sequences as
names of "abstract" objects. As naming systems we consider notations, i.e.
surjections $\nu :\subseteq \Sigma^* \to S$, and representations, i.e. surjections $\delta :\subseteq \Sigma^\omega \to M$.
Continuity and computability concepts are transferred from $\Sigma^*$ and $\Sigma^\omega$ via
notations and representations, respectively, to the named sets straightforwardly,
see [KW85, Wei87, Wei95, Wei97]. Mainly notations or representations which
are compatible with some relevant structure on the set under consideration are
of practical interest. We do not discuss this for notations (see [RW80, Wei87] and
Appendix C in [Wei95]), but we will introduce "effective" notations explicitly
whenever necessary. In particular, for the rational numbers let $\nu_Q :\subseteq \Sigma^* \to \mathbb{Q}$
be the standard representation via fractions of integers in binary notation. We
shall abbreviate $\nu_Q(w)$ by $\overline{w}$. Standard notations of the natural numbers, pairs
of rational numbers etc. will be used without further definitions. For uncountable
sets M we shall consider mainly representations derived from "information
structures" $(M, \sigma, \nu)$, where $\sigma$ is a countable subset of $2^M$ of "atomic properties"
which identifies points, and $\nu$ is a notation of $\sigma$ [Wei97]. It is assumed that
a computer (Turing machine) manipulates $\nu$-names of atomic properties. As a
name of an object $x \in M$ we consider any infinite list of all properties $A \in \sigma$
which hold for x. Concretely, the standard representation $\delta_\nu :\subseteq \Sigma^\omega \to M$ is
defined by

$\delta_\nu(p) = x :\iff p = w_0 \# w_1 \# \ldots$ with $\{w_i \mid i \in \omega\} = \{w \mid x \in \nu(w)\}$.

Every finite prefix of a $\delta_\nu$-name p of x contains finitely many atomic properties
of x which "approximate" x. Mathematically, this kind of approximation is
described by the topology $\tau_\sigma$ on M, which has $\sigma$ as a subbase. Computability on $\sigma$
and via $\delta_\nu$ on M are fixed by the notation $\nu$ which expresses how atomic
properties can be handled concretely. Thus, for any information structure $(M, \sigma, \nu)$,
$\sigma$ characterizes approximation and $\nu$ computability on M. The topology $\tau_\sigma$ and
the standard representation $\delta_\nu$ are closely related: $A \in \tau_\sigma \iff \delta_\nu^{-1}A$ is open in
dom($\delta_\nu$) (for all $A \subseteq M$), i.e. $\tau_\sigma$ is the final topology of $\delta_\nu$. Let $\delta :\subseteq \Sigma^\omega \to M$
and $\delta' :\subseteq \Sigma^\omega \to M'$ be representations and let $f :\subseteq M \to M'$ be a function.
An element $x \in M$ is called $\delta$-computable, iff $\delta(p) = x$ for some computable
sequence $p \in \Sigma^\omega$. By definition, $\delta \leq_t \delta'$ ($\delta \leq \delta'$), iff $\delta(p) = \delta' g(p)$ for all
$p \in$ dom($\delta$) for some continuous (computable) function $g :\subseteq \Sigma^\omega \to \Sigma^\omega$, and f is
$(\delta, \delta')$-continuous (-computable), iff $f\delta(p) = \delta' g(p)$ for all $p \in$ dom($f\delta$) for some
continuous (computable) function $g :\subseteq \Sigma^\omega \to \Sigma^\omega$. (Accordingly for functions with
two or more arguments.) By the
"main theorem for admissible representations" [KW85] a function is continuous
relative to standard representations, iff it is continuous w.r.t. the associated final
topologies in the usual sense. For more details see [KW85, Wei87, Wei95, Wei97].
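For the real numbers, we need three representations $\rho_<, \rho_>, \rho :\subseteq \Sigma^\omega \to \mathbb{R}$,
derived from information structures. They can be defined explicitly as follows
[Wei87, Wei97], where $\nu_{\mathbb{Q}}$ is the notation of the rational numbers:

$\rho_<(p) = x \;:\iff\; p = u_0 \sharp u_1 \sharp \ldots$ with $\{\nu_{\mathbb{Q}}(u_i) \mid i \in \omega\} = \{r \in \mathbb{Q} \mid r < x\}$,
$\rho_>(p) = x \;:\iff\; p = u_0 \sharp u_1 \sharp \ldots$ with $\{\nu_{\mathbb{Q}}(u_i) \mid i \in \omega\} = \{r \in \mathbb{Q} \mid r > x\}$,
$\rho(p) = x \;:\iff\; p$ lists pairs $(v_i, w_i)$ with $\{(\nu_{\mathbb{Q}}(v_i), \nu_{\mathbb{Q}}(w_i)) \mid i \in \omega\} = \{(r, s) \in \mathbb{Q}^2 \mid r < x < s\}$.

The final topologies are $\tau_< = \{(y; \infty) \mid y \in \mathbb{R}\} \cup \{\mathbb{R}\}$, $\tau_> = \{(-\infty; y) \mid y \in \mathbb{R}\} \cup \{\mathbb{R}\}$
and the set $\tau_{\mathbb{R}}$ of ordinary open subsets of $\mathbb{R}$, respectively. Notice that
$\rho$ induces the standard computability theory on the real line. The translatability
or reducibility properties [Wei87, Wei97] $\rho \le \rho_<$, $\rho \le \rho_>$, $\rho_< \not\le_t \rho$, $\rho_> \not\le_t \rho$,
$\rho_< \not\le_t \rho_>$, $\rho_> \not\le_t \rho_<$ can be proved easily.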
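The reduction $\rho \le \rho_<$ can be illustrated by a minimal sketch. It assumes that names are modelled as Python generators of rational numbers instead of infinite words over $\Sigma$, and that the real $x$ is given exactly as a `Fraction` so that the comparisons are decidable; the helper names `rationals`, `rho_name` and `rho_to_rho_less` are hypothetical and not taken from [Wei87, Wei97].

```python
from fractions import Fraction
from itertools import count, islice


def rationals():
    """Enumerate every rational number (repetitions are harmless)."""
    for s in count(1):
        for q in range(1, s + 1):
            p = s - q
            yield Fraction(p, q)
            yield Fraction(-p, q)


def rho_name(x):
    """Yield all rational pairs (v, w) with v < x < w, in some order."""
    seen = []
    for r in rationals():
        seen.append(r)
        for v in seen:
            for a, b in ((v, r), (r, v)):
                if a < x < b:
                    yield (a, b)


def rho_to_rho_less(name):
    """Translate a rho-name into a rho_<-name by keeping the lower bounds."""
    for v, _w in name:
        yield v


x = Fraction(1, 3)
pairs = list(islice(rho_name(x), 40))
lows = list(islice(rho_to_rho_less(rho_name(x)), 40))
```

The translation only forgets information, which is why it is computable. The converse direction would require upper bounds that a $\rho_<$-name never lists.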

In Section 2 we introduce a standard representation $\delta_m$ of the set $M$ of
probability measures on the Borel sets of the interval [0; 1] by a very natural
information structure. We prove a stability theorem for this definition. We
discuss some further modifications of the definition and show that they have
undesirable properties. The results indicate that the computability theory on $M$
induced by the representation $\delta_m$ is indeed very natural. In Section 3 we prove
computability of several interesting functions on measures, in particular linear
combination and integration of continuous functions. Also the measure
transformation induced by a computable iterated function system with probabilities
[Hut81, Bar93] is computable. Finally in Section 4, we introduce representations
based on other natural information structures and a Cauchy representation for
the Hutchinson metric [Hut81, Bar93]. We prove that all these representations
are equivalent and that their final topology is the well known weak topology
[Bau74].


2  The  standard  representation  of  measures 

In this section we introduce the standard representation $\delta_m$ of the probability
measures and show that it induces a very natural computability theory. Let
$Int := \{(a; b), [0; a), (b; 1], [0; 1] \mid a, b \in \mathbb{Q},\ 0 < a < b < 1\}$ be the set of open
subintervals of [0; 1] with rational boundaries, and let $I :\subseteq \Sigma^* \to Int$ be some
standard notation of $Int$ with $dom(I) \subseteq (\Sigma \setminus \{\flat, \sharp\})^*$. We write $I_w$ for $I(w)$. By $B$
we denote the set of Borel subsets of [0; 1], i.e. the smallest $\sigma$-algebra containing
$Int$. By $M$ we denote the set of probability measures $\mu : B \to \mathbb{R}$ on the
space ([0; 1], B). By a basic theorem of measure theory [Bau74], every measure
$\mu \in M$ is defined uniquely by its values on the generating set $Int$. We introduce
a standard representation of $M$ via an information structure. The information
available from some standard name of a measure $\mu$ shall be all $(r, J)$ with $r \in \mathbb{Q}$
and $J \in Int$ such that $r < \mu(J)$.

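Definition 1. Define an information structure $(M, \sigma, \nu)$ by $\sigma := range(\nu)$,
where $\mu \in \nu(u \flat v) :\iff \nu_{\mathbb{Q}}(u) < \mu(I_v)$ for all $u \in dom(\nu_{\mathbb{Q}})$, $v \in dom(I)$ and
$\mu \in M$. Let $\tau_m$ be the topology on $M$ with subbase $\sigma$ and let $\delta_m$ be the standard
representation of $M$ derived from $\nu$.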
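To make Definition 1 concrete, the following sketch lists a finite part of the information contained in a standard name of a discrete measure. It is only an illustration under stated assumptions: the measure is a dictionary from rational atoms to rational weights, an interval is a pair `(a, b)` that is read as open in [0; 1] (so it contains 0 if `a == 0` and 1 if `b == 1`), and only rationals with a fixed denominator `n` are considered. The names `measure` and `atomic_properties` are hypothetical.

```python
from fractions import Fraction


def measure(atoms, interval):
    """Measure of an interval that is open in [0; 1]."""
    a, b = interval
    return sum(
        (w for x, w in atoms.items()
         if (a < x or a == x == 0) and (x < b or b == x == 1)),
        Fraction(0),
    )


def atomic_properties(atoms, n):
    """All pairs (r, J) with r < mu(J), restricted to denominator n."""
    grid = [Fraction(j, n) for j in range(n + 1)]
    bounds = [Fraction(j, n) for j in range(-1, n + 1)]
    props = []
    for a in grid:
        for b in grid:
            if a < b:
                m = measure(atoms, (a, b))
                props.extend((r, (a, b)) for r in bounds if r < m)
    return props


# Unit mass in the point 1/2.
props = atomic_properties({Fraction(1, 2): Fraction(1)}, 4)
```

A genuine name lists such pairs for all rationals and all intervals of $Int$, so every rational lower bound of $\mu(J)$ eventually appears, while no upper bound is ever stated.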

It remains to show that $\sigma$ identifies the points of $M$. Consider measures $\mu, \mu' \in M$
such that $r < \mu(J) \iff r < \mu'(J)$ for all $r \in \mathbb{Q}$ and $J \in Int$. Then obviously,
$\mu(J) = \mu'(J)$ for all $J \in Int$, i.e. $\mu = \mu'$. The definition of the representation $\delta_m$
looks somewhat arbitrary. By the next stability lemma, we obtain an equivalent
representation, if we replace $\nu_{\mathbb{Q}}$ and $I$ by adequate other notations. For any
$X \subseteq \mathbb{R}$ let $cls(X)$ be the closure of $X$.

Lemma 2. (stability of $\delta_m$) Let $\nu_S :\subseteq \Sigma^* \to S$ be a notation of a set $S$ which is
dense in $\mathbb{R}$ such that $\{(u, v) \mid \nu_S(u) < \nu_{\mathbb{Q}}(v)\}$ and $\{(u, v) \mid \nu_{\mathbb{Q}}(u) < \nu_S(v)\}$ are
r.e. Let $D$ be a countable dense subset of [0; 1] and let $I'$ be a notation of $Int' :=
\{(a; b), [0; a), (a; 1], [0; 1] \mid a, b \in D,\ 0 < a < b < 1\}$ such that $\{(u, v) \mid cls(I_u) \subseteq I'_v\}$
and $\{(u, v) \mid cls(I'_u) \subseteq I_v\}$ are r.e. Define $\tau'_m$ and $\delta'_m$ by substituting $\nu_S$ for
$\nu_{\mathbb{Q}}$ and $I'$ for $I$ in Definition 1.

Then $\tau'_m = \tau_m$ and $\delta'_m \equiv \delta_m$.

If  we  replace,  for  example,  rational  numbers  by  finite  binary  fractions  or  by  finite 
decimal  fractions  in  the  definition  of  the  set  Int  and  in  Definition  1,  we  obtain 
an  equivalent  representation  with  the  same  final  topology. 

If we replace the relation "$<$" in Definition 1 by "$\le$", "$>$" or "$\ge$", we obtain
representations which violate Lemma 2. Remember that by definition, the topology
$\tau_m$ has the subbase $\sigma = \{U_{r,J} \mid r \in \mathbb{Q} \text{ and } J \in Int\}$ where $U_{r,J} = \{\mu \in
M \mid r < \mu(J)\}$. We prepare the proof of the theorem by two lemmas. First, we
consider the cases "$r \le \mu(J)$" and "$r \ge \mu(J)$".

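Lemma 3. For $Q \subseteq \mathbb{R}$ let $\tau(Q)$ be the topology on $M$ generated by the subbase
$\sigma(Q) := \{U_{r,J} \mid r \in Q,\ J \in Int\}$, where $U_{r,J} = \{\mu \in M \mid r \le \mu(J)\}$.

Then $\tau(P) \not\subseteq \tau(Q)$, if $t \in P \setminus Q$ for some $t \in (0; 1)$ (for all $P, Q \subseteq \mathbb{R}$).

The statement holds accordingly, if "$\le$" is replaced by "$\ge$".

The next lemma considers the case "$r > \mu(J)$".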

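Lemma 4. For $D \subseteq (0; 1)$ let $Int(D) := \{(a; b), [0; a), (a; 1], [0; 1] \mid a, b \in D,\ 0 <
a < b < 1\}$. Let $\tau(D)$ be the topology on $M$ generated by the subbase $\sigma(D) :=
\{U_{r,J} \mid r \in \mathbb{Q},\ J \in Int(D)\}$ where $U_{r,J} = \{\mu \in M \mid r > \mu(J)\}$. Then

$D \subseteq E \iff \tau(D) \subseteq \tau(E)$ (for all $D, E \subseteq (0; 1)$).

Theorem 5. If in Definition 1 the relation "$\nu_{\mathbb{Q}}(u) < \mu(I_v)$" is replaced by
"$\nu_{\mathbb{Q}}(u) > \mu(I_v)$", "$\nu_{\mathbb{Q}}(u) \ge \mu(I_v)$" or "$\nu_{\mathbb{Q}}(u) \le \mu(I_v)$", the resulting
representations violate the stability lemma 2.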

By  Definition  1  and  Lemmata  3  and  4,  many  different  more  or  less  natural 
representations  and  hence  computability  theories  for  the  set  M  of  probability 
measures  on  ([0;1],B)  can  be  introduced.  The  “user”  has  to  decide,  which  of 
them  is  adequate  for  his  application.  The  stable  representation  from  Defini¬ 
tion  1  is  certainly  the  most  important  one,  since  its  computability  theory  will 
occur  most  frequently.  We  shall  study  it  in  the  following  exclusively. 

As a simple consequence of Definition 1, all rational lower bounds of $\mu(J)$ can
be obtained from any $\delta_m$-name of $\mu$ and any $I$-name of $J$. This property
characterizes the representation $\delta_m$ except for equivalence: The representation $\delta_m$ is
$\le$-complete in the set of all representations $\delta$ of $M$, for which $(\mu, J) \mapsto \mu(J)$ is
$(\delta, I, \rho_<)$-computable.

Theorem 6. For any representation $\delta$ of $M$: $\delta \le \delta_m \iff (\mu, J) \mapsto \mu(J)$ is
$(\delta, I, \rho_<)$-computable.

Notice, that in particular $(\mu, J) \mapsto \mu(J)$ is $(\delta_m, I, \rho_<)$-computable. Computing
only lower rational bounds does not seem to be satisfactory. We would like to
compute also arbitrarily close upper bounds of $\mu(I_v)$. We prove a negative and
a positive answer. For any $x \in [0; 1]$ define $\mu_x \in M$ by $\mu_x(A) := (1 \text{ if } x \in A,\ 0
\text{ otherwise})$. For any good and useful representation $\delta$ of $M$ it should be possible
to determine a $\delta$-name of the measure $\mu_x$ effectively from a name of $x$. Let
$M' := \{\mu_x \mid x \in [0; 1]\}$.

Theorem 7. For any representation $\delta$ of $M$, for which $x \mapsto \mu_x$ is $(\rho, \delta)$-continuous
on (0; 1), $\mu \mapsto \mu[0; 1/2)$ is not $(\delta, \rho_>)$-continuous on $M'$. $\delta_m$ is such a
representation.
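The obstruction behind Theorem 7 is a jump of $x \mapsto \mu_x[0; 1/2)$ at $x = 1/2$. The following lines are a small numerical illustration only, not a proof; `point_measure` is a hypothetical helper that evaluates $\mu_x$ on half-open intervals $[a; b)$.

```python
from fractions import Fraction


def point_measure(x):
    """The measure mu_x (unit mass in x), evaluated on intervals [a; b)."""
    return lambda a, b: 1 if a <= x < b else 0


half = Fraction(1, 2)
# Points x_i = 1/2 - 2^-i converge to 1/2 from below.
values = [point_measure(half - Fraction(1, 2 ** i))(0, half) for i in range(1, 8)]
limit_value = point_measure(half)(0, half)
```

Every $\mu_{x_i}[0; 1/2)$ equals 1 while $\mu_{1/2}[0; 1/2) = 0$. An upper bound such as 1/2 for the limit measure would have to be justified by a finite prefix of its name, but that prefix is also consistent with some $\mu_{x_i}$, for which the bound is false.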
Therefore, for reasonable representations $\delta$ of $M$, in particular for our
standard representation $\delta_m$, arbitrarily close rational upper bounds of measures
of open intervals cannot be computed. Although this contradicts intuition at
first glance, it has to be accepted as a matter of fact. Notice, that for
proving Lemma 3, Lemma 4 and Theorem 7 we have used measures $\mu \in M$ with
$\mu\{x\} > 0$ for some $x \in \mathbb{R}$. Since the arguments have been purely topological
without reference to computability, we have also shown that the final topology
$\tau_m$ of the representation $\delta_m$, which formalizes a concept of "approximation"
on the set $M$ of measures, is quite natural. If we exclude measures with
$\mu\{x\} > 0$ for some $x \in [0; 1]$, $(\mu, J) \mapsto \mu(J)$ becomes $(\delta_m, I, \rho)$-computable.

Let $M^0 := \{\mu \in M \mid \forall x \in [0; 1].\ \mu\{x\} = 0\}$.

Theorem 8. The function $(\mu, J) \mapsto \mu(J)$ is $(\delta_m, I, \rho)$-computable for $J \in Int$
and $\mu \in M^0$.

3  Computable  Functions  on  Measures 

In  this  section  we  prove  computability  of  some  interesting  functions  on  prob¬ 
ability  measures.  By  the  next  theorem,  the  linear  combination  of  measures  is 
computable  in  all  variables. 

Theorem 9. The function $(a, \mu, \mu') \mapsto a\mu + (1 - a)\mu'$ is $(\rho, \delta_m, \delta_m, \delta_m)$-computable
for $0 < a < 1$.

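By Theorem 6, $(\mu, J) \mapsto \mu(J)$ is $(\delta_m, I, \rho_<)$-computable on $M \times Int$. We extend
this result to $\tau_{[0;1]} := \{U \cap [0; 1] \mid U \in \tau_{\mathbb{R}}\}$, the set of all open subsets of [0; 1].
First we need a representation of this topology. For the set $\tau_{\mathbb{R}}$ of open subsets of
$\mathbb{R}$, the following information structure $(\tau_{\mathbb{R}}, \sigma, \nu)$ and its derived representation
$\delta_o$ and topology $\tau_o$ are natural (see [Wei97]): For any $U \in \tau_{\mathbb{R}}$ and $u, v \in \Sigma^*$
let $U \in \nu(u \flat v)$ iff $[\nu_{\mathbb{Q}}(u); \nu_{\mathbb{Q}}(v)] \subseteq U$. Consequently, $\delta_o(p) = U$ iff $p$ is a list of all
closed intervals with rational boundaries contained in $U$. We define our standard
representation $\delta'_o$ of $\tau_{[0;1]}$ accordingly: $\delta'_o(p) = U :\iff p$ is a list of all $w \in \Sigma^*$ with
$cls(I_w) \subseteq U$ ($p \in \Sigma^\omega$, $U \in \tau_{[0;1]}$). Let $\mu_L \in M$ be the Lebesgue measure on
([0; 1], B).

Theorem 10. (1) $(\mu, U) \mapsto \mu(U)$ for $\mu \in M$ and $U \in \tau_{[0;1]}$ is $(\delta_m, \delta'_o, \rho_<)$-computable.
(2) $(\mu, U) \mapsto \mu(U)$ for $\mu = \mu_L$ and $U \in \tau_{[0;1]}$ is not $(\delta_m, \delta'_o, \rho_>)$-continuous.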

For uniform formulations in the next theorems we need a standard representation
of the set $C[0; 1]$ of continuous functions $f : [0; 1] \to \mathbb{R}$. We define $\delta_\to$
and the corresponding final topology $\tau_\to$ by the following information structure
$(C[0; 1], \sigma, \nu)$: $f \in \nu(u \flat v \sharp w) :\iff \nu_{\mathbb{Q}}(u) < f(cls\, I_v) < \nu_{\mathbb{Q}}(w)$ for all $f \in C[0; 1]$
and $u, v, w \in \Sigma^*$. Properties of $\delta_\to$ are discussed in [Wei87, Wei95, Wei97]. In
particular, $\tau_\to$ is the compact-open topology on $C[0; 1]$, which is also generated
by the metric $d(f, g) := \max\{|f(x) - g(x)| \mid 0 \le x \le 1\}$ on $C[0; 1]$. For any measure
$\mu \in M$ and any continuous function $f : [0; 1] \to [0; 1]$ define the measure $T_f(\mu)$
by $T_f(\mu)(A) := \mu f^{-1}(A)$ for every Borel set $A \subseteq [0; 1]$ (see [Bau74], page 42).

Theorem 11. The function $(f, \mu) \mapsto T_f(\mu)$ for continuous $f : [0; 1] \to [0; 1]$ and
$\mu \in M$ is $(\delta_\to, \delta_m, \delta_m)$-computable.

We apply this theorem to iterated function systems with probabilities [Hut81,
Bar93]. An iterated function system (IFS) on [0; 1] with probabilities is a tuple
$S = ([0; 1], f_1, \ldots, f_k, p_1, \ldots, p_k)$ where $f_1, \ldots, f_k : [0; 1] \to [0; 1]$ are continuous
functions and $p_1, \ldots, p_k$ are positive real numbers with $p_1 + \ldots + p_k = 1$. With
$S$ one associates the function $T_S : M \to M$ defined by

$T_S(\mu) := \sum_{i=1}^{k} p_i\, T_{f_i}(\mu).$

Corollary 12. Let $S = ([0; 1], f_1, \ldots, f_k, p_1, \ldots, p_k)$ be an IFS with probabilities
such that $f_1, \ldots, f_k$ are $\delta_\to$-computable and $p_1, \ldots, p_k$ are $\rho$-computable. Then
$T_S : M \to M$ is $(\delta_m, \delta_m)$-computable.
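For discrete measures the operator associated with an IFS can be evaluated exactly. The sketch below assumes a measure given as a dictionary from rational atoms to rational weights and maps that send rationals to rationals; `push_forward` and `ifs_step` are hypothetical names, and the example system (two affine contractions with probability 1/2 each) is chosen for illustration only.

```python
from fractions import Fraction


def push_forward(f, atoms):
    """Image measure T_f(mu) of a discrete measure mu."""
    out = {}
    for x, w in atoms.items():
        y = f(x)
        out[y] = out.get(y, 0) + w
    return out


def ifs_step(maps, probs, atoms):
    """One application of the operator: sum of p_i * T_{f_i}(mu)."""
    out = {}
    for f, p in zip(maps, probs):
        for y, w in push_forward(f, atoms).items():
            out[y] = out.get(y, 0) + p * w
    return out


half = Fraction(1, 2)
maps = [lambda x: x / 2, lambda x: x / 2 + half]
probs = [half, half]
mu0 = {Fraction(0): Fraction(1)}
mu1 = ifs_step(maps, probs, mu0)
mu2 = ifs_step(maps, probs, mu1)
```

Starting from the unit mass in 0, two steps spread the mass evenly over 0, 1/4, 1/2 and 3/4. For this particular system the iterates approach the Lebesgue measure.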

Therefore, for any computable iterated function system $S$ with probabilities, the
associated measure transformation $T_S : M \to M$ is a $(\delta_m, \delta_m)$-computable
function. We shall show below (Theorem 23) that its unique fixed point $\mu_S \in M$
is $\delta_m$-computable, if the system $S$ is hyperbolic [Hut81]. We shall show that
integration of continuous functions is computable in both arguments. The integral
of a continuous function can be defined via summations over finite partitions.
Consider $\mu \in M$ and $f \in C[0; 1]$. Let $Part$ be the set of all finite partitions
$Z$ of [0; 1] into intervals with rational boundaries (remember: $\bigcup Z = [0; 1]$ and
$I \cap J = \emptyset$ for distinct $I, J \in Z$). For $Z \in Part$ define

$s_+(Z) := \sum_{J \in Z} \mu(J) \cdot \sup_{x \in J} f(x)$ and $s_-(Z) := \sum_{J \in Z} \mu(J) \cdot \inf_{x \in J} f(x)$.

Since $f$ is continuous, we have

$\sup_{Z \in Part} s_-(Z) = \inf_{Z \in Part} s_+(Z) =: \int f\,d\mu.$

The following lemma is the key to the next proof.

Lemma 13. For any $\beta, \gamma > 0$ there are a finite set $T \subseteq Int$ of (pairwise disjoint)
open intervals and a finite set $L$ of closed intervals such that $T \cup L \in Part$,
$length(J) < \gamma$ for every $J \in T$ and $\mu(\bigcup L) < \beta$. ($L$ can be chosen, such that
each $J \in L$ has length 0.)

Theorem 14. The function $(f, \mu) \mapsto \int f\,d\mu$ for $f \in C[0; 1]$ and $\mu \in M$ is
$(\delta_\to, \delta_m, \rho)$-computable.

Proof: For any $T \subseteq Int$ let $s_-(T) := \sum\{\mu(J) \cdot \inf f(J) \mid J \in T\}$. Consider
$f \in C[0; 1]$ and $\varepsilon > 0$. By uniform continuity of $f$ there is some $\gamma > 0$ such
that $|x - y| < \gamma \Rightarrow |fx - fy| < \varepsilon/4$. Let $M := \max\{|f(x)| \mid 0 \le x \le 1\}$, choose
$\beta := \varepsilon/(4(1 + M))$. By Lemma 13 there is some set $T \subseteq Int$ of pairwise disjoint
intervals such that $1 - \beta < \mu(\bigcup T) \le 1$ and $\forall J \in T.\ length(J) < \gamma$. Furthermore,
there are $z_J \in \mathbb{Q}$ such that $z_J < \mu(J)$ for $J \in T$ and $1 - \beta < \sum\{z_J \mid J \in T\} < 1$.
We describe a procedure for determining from $(p, q, n)$ a number $r \in \mathbb{Q}$ with
$|r - \int f\,d\mu| < 2^{-n}$ where $\delta_\to(p) = f$ and $\delta_m(q) = \mu$.

- From $p$ and $n$ determine some $k \in \omega$ such that $|x - y| < 2^{-k} \Rightarrow |fx - fy| <
2^{-n-2}$ [Wei95, Wei97].

- From $p$ determine some integer upper bound $m$ of $M$.

- Let $\beta := 2^{-n-2}/(1 + m)$.

- By systematic search find a finite set $T \subseteq Int$ of pairwise disjoint intervals
and rational numbers $z_J$ ($J \in T$) with $length(J) < 2^{-k}$ and $z_J < \mu(J)$ for
$J \in T$ and $1 - \beta < \sum\{z_J \mid J \in T\}$.

- Determine some $r \in \mathbb{Q}$ such that $|\sum\{z_J \cdot \inf f(J) \mid J \in T\} - r| < 2^{-n-2}$.

The existence of $T$ and the numbers $z_J$ has already been shown. We prove
$|r - \int f\,d\mu| < 2^{-n}$. Let $L$ be the set from Lemma 13 and let $T' := T \cup L$. We
have:

$|s_-(T') - \int f\,d\mu| \le |s_+(T') - s_-(T')|$
$\qquad \le \sum\{\mu(J)(\sup f(J) - \inf f(J)) \mid J \in T'\}$
$\qquad \le 2^{-n-2}$

$|s_-(T') - s_-(T)| \le \sum\{\mu(J) \cdot |\inf f(J)| \mid J \in L\}$
$\qquad \le \mu(\bigcup L) \cdot m$
$\qquad \le \beta \cdot m$
$\qquad \le 2^{-n-2}$

$|s_-(T) - \sum\{z_J \cdot \inf f(J) \mid J \in T\}| \le \sum\{(\mu(J) - z_J) \cdot |\inf f(J)| \mid J \in T\}$
$\qquad \le \beta \cdot m$
$\qquad \le 2^{-n-2}$

By the triangle inequality we obtain $|\int f\,d\mu - r| < 2^{-n}$.

There is a computable procedure for determining $r$, i.e. there is some computable
function $g :\subseteq \Sigma^\omega \times \Sigma^\omega \times \Sigma^* \to \Sigma^*$ such that for $f = \delta_\to(p)$, $\mu = \delta_m(q)$ and
a name $u$ of $n$ we have $|\nu_{\mathbb{Q}}(v) - \int f\,d\mu| < 2^{-n}$ where $v = g(p, q, u)$. Using a machine for $g$
one can define easily a machine for a function $h :\subseteq \Sigma^\omega \times \Sigma^\omega \to \Sigma^\omega$ such that
$\int \delta_\to(p)\,d\delta_m(q) = \rho h(p, q)$ for all $p \in dom(\delta_\to)$ and $q \in dom(\delta_m)$.

□
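The procedure from the proof of Theorem 14 can be sketched in a simplified setting. The sketch makes several assumptions that go beyond the proof: the measure is given by an oracle `mu(a, b)` that returns the exact measure of an interval open in [0; 1] (a genuine $\delta_m$-name only enumerates lower bounds), $f$ is evaluated exactly at the rational midpoint of each interval instead of approximating $\inf f(J)$, and the systematic search only tries equidistant grids, which suffices for measures with finitely many rational atoms but not in general. The names `integrate`, `discrete` and `k_for` are hypothetical.

```python
from fractions import Fraction


def discrete(atoms):
    """Oracle for a discrete measure; (a, b) contains 0 if a == 0, 1 if b == 1."""
    def mu(a, b):
        return sum(w for x, w in atoms.items()
                   if (a < x or a == x == 0) and (x < b or b == x == 1))
    return mu


def integrate(f, k_for, m, mu, n):
    """Return a rational r with |r - integral of f d(mu)| < 2**-n.

    k_for(j) is a modulus of continuity: |x - y| < 2**-k_for(j) implies
    |f(x) - f(y)| < 2**-j.  m is an integer upper bound of max |f|.
    """
    k = k_for(n + 2)
    beta = Fraction(1, 2 ** (n + 2)) / (1 + m)
    cells_count = 2 ** k + 1          # every cell is shorter than 2**-k
    while True:                       # search for a grid that avoids the atoms
        cells = [(Fraction(j, cells_count), Fraction(j + 1, cells_count))
                 for j in range(cells_count)]
        mass = [mu(a, b) for a, b in cells]
        if sum(mass) > 1 - beta / 2:
            # rational lower bounds z_J < mu(J) whose sum exceeds 1 - beta
            z = [w - beta / (2 * cells_count) for w in mass]
            return sum(zj * f((a + b) / 2) for zj, (a, b) in zip(z, cells))
        cells_count += 1


lebesgue = lambda a, b: b - a
r1 = integrate(lambda x: x, lambda j: j, 1, lebesgue, 5)
two_atoms = discrete({Fraction(1, 3): Fraction(1, 2), Fraction(1): Fraction(1, 2)})
r2 = integrate(lambda x: x, lambda j: j, 1, two_atoms, 3)
```

In the second call the first grid has 33 cells and the atom 1/3 falls on a grid point, so the search moves on to 34 cells. This mirrors the role of the set $L$ in Lemma 13: the mass sitting on the cut points must be small.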

As a corollary of Theorem 7, Theorem 14 cannot be extended from $C[0; 1]$ to
the measurable functions, not even to step functions.

Corollary 15. Let $f : [0; 1] \to \mathbb{R}$ be the characteristic function of [0; 1/2).
Then $\mu \mapsto \int f\,d\mu$ is not $(\delta_m, \rho_>)$-continuous on $M$.


4  Further  Representations  of  Measures 

In Definition 1 we have used atomic properties $r < \mu(J)$ with $r \in \mathbb{Q}$ and $J \in
Int$ for identifying measures. By Theorem 14, $(f, \mu) \mapsto \int f\,d\mu$ is $(\delta_\to, \delta_m, \rho)$-computable
for continuous functions. In the following we identify measures $\mu$
by atomic properties $r < \int t\,d\mu$ or $r < \int t\,d\mu < s$, where $r, s \in \mathbb{Q}$ and $t$ is from a
set of simple continuous "test functions".

Definition 16. For $n \in \omega$ and $0 \le m \le 2^n$ define the triangle function $t_{nm} \in
C[0; 1]$ by

$t_{nm}(x) := x - (m - 1)2^{-n}$ if $(m - 1)2^{-n} \le x \le m \cdot 2^{-n}$,
$t_{nm}(x) := (m + 1)2^{-n} - x$ if $m \cdot 2^{-n} < x \le (m + 1) \cdot 2^{-n}$,
$t_{nm}(x) := 0$ otherwise.
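The triangle functions of Definition 16 are easy to evaluate exactly on rationals. The sketch below assumes the index range $0 \le m \le 2^n$ and a discrete measure given as a dictionary from rational atoms to rational weights; `triangle` and `integral` are hypothetical names.

```python
from fractions import Fraction


def triangle(n, m):
    """The triangle function t_nm: slope 1 up to m*2^-n, slope -1 after."""
    h = Fraction(1, 2 ** n)

    def t(x):
        if (m - 1) * h <= x <= m * h:
            return x - (m - 1) * h
        if m * h < x <= (m + 1) * h:
            return (m + 1) * h - x
        return Fraction(0)

    return t


def integral(t, atoms):
    """Integral of t with respect to a discrete measure."""
    return sum(w * t(x) for x, w in atoms.items())


mu = {Fraction(1, 4): Fraction(1, 2), Fraction(3, 4): Fraction(1, 2)}
value = integral(triangle(2, 1), mu)
# For fixed n the triangles add up to the constant 2^-n on [0; 1].
total = sum(triangle(2, m)(Fraction(3, 8)) for m in range(0, 5))
```

Here `value` is $\int t_{21}\,d\mu$ for the measure with mass 1/2 in each of 1/4 and 3/4, and `total` checks at one sample point that the triangles of level $n = 2$ sum to $2^{-2}$.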
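Let $\delta'_m$ and $\delta''_m$ be the standard representations of $M$ induced by the
information structures $(M, \sigma', \nu')$ and $(M, \sigma'', \nu'')$, respectively, defined as follows:
the atomic property of $\sigma'$ named by $(u, n, m)$ holds for $\mu$ iff $\nu_{\mathbb{Q}}(u) < \int t_{nm}\,d\mu$, and
the atomic property of $\sigma''$ named by $(u, v, n, m)$ holds for $\mu$ iff
$\nu_{\mathbb{Q}}(u) < \int t_{nm}\,d\mu < \nu_{\mathbb{Q}}(v)$,
for all $\mu \in M$, $n \in \omega$, $0 \le m \le 2^n$ and $u, v \in dom(\nu_{\mathbb{Q}})$.

We have not yet shown, that the systems $\sigma'$ and $\sigma''$ from Definition 16 identify
points, i.e. $\delta'_m$ and $\delta''_m$ may still be representations of partitions of $M$ which are
coarser than $\{\{\mu\} \mid \mu \in M\}$.

Theorem 17. $\delta'_m$ and $\delta''_m$ are representations of $M$ such that $\delta_m \equiv \delta'_m \equiv \delta''_m$.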

By definition, the weak topology on the set $M$ of probability measures on
([0; 1], B) is the coarsest, i.e. smallest, topology $\tau$, such that $\mu \mapsto \int f\,d\mu$ is
$(\tau, \tau_{\mathbb{R}})$-continuous for every $f \in C[0; 1]$ [Bau74]. As a corollary of Theorem
17 we obtain:

Corollary 18. The weak topology is the final topology $\tau_m$ of the representation $\delta_m$.

The  weak  topology  on  ([0;  1],B)  can  be  generated  by  a  metric  [Bau74]. 

Definition 19. (Hutchinson metric) Let $Lip := \{f \in C[0; 1] \mid f(0) = 0$ and
$\forall x, y.\ |f(x) - f(y)| \le |x - y|\}$. Define $d_H : M \times M \to \mathbb{R}$ by $d_H(\mu, \mu') :=
\sup\{|\int f\,d\mu - \int f\,d\mu'| \mid f \in Lip\}$.

The metric $d_H$ is called the Hutchinson metric [Hut81, Bar93].
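For discrete measures the Hutchinson distance can be computed exactly. The sketch below does not follow the definition via a supremum over $Lip$. It instead uses the classical identity that, for probability measures on the real line, this supremum equals the integral of the absolute difference of the two distribution functions (Kantorovich-Rubinstein duality); this identity is an outside fact and is not taken from the text above. The name `hutchinson` and the dictionary encoding of measures are assumptions.

```python
from fractions import Fraction


def hutchinson(mu, nu):
    """Distance of two discrete probability measures on [0; 1].

    Computed as the integral over x of |mu[0; x] - nu[0; x]|.
    """
    points = sorted(set(mu) | set(nu))
    dist = Fraction(0)
    cdf_gap = Fraction(0)
    for left, right in zip(points, points[1:]):
        cdf_gap += mu.get(left, 0) - nu.get(left, 0)
        dist += abs(cdf_gap) * (right - left)
    return dist


d1 = hutchinson({Fraction(0): 1}, {Fraction(1): 1})
d2 = hutchinson({Fraction(1, 4): 1}, {Fraction(3, 4): 1})
```

For two point measures the distance is simply the distance of the points, so `d1` is 1 and `d2` is 1/2.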

Lemma 20. $d_H$ is a metric on $M$.

Theorem 21. $d_H : M \times M \to \mathbb{R}$ is $(\delta_m, \delta_m, \rho)$-computable.

By Lemma 2.1 from [Wei93], the metric space $(M, d_H)$ has a countable dense
subset. By Corollary 45.4 from [Bau74], the discrete measures are dense. We shall
use the discrete measures determined by rational numbers as a dense subset. Let
$M_d$ be the set of all probability measures $\mu \in M$ such that there are a finite set
$K$ and rational numbers $r_k, s_k \in [0; 1]$ for all $k \in K$ such that $\sum\{s_k \mid k \in K\} = 1$
and $\mu = \sum_{k \in K} s_k \mu_{r_k}$, where $\mu_x(A) := (1 \text{ if } x \in A,\ 0 \text{ otherwise})$. Let $\nu_d$ be a standard
notation of $M_d$. A computable metric space is a quadruple $(M, d, A, \nu)$ such
that $(M, d)$ is a metric space, $A$ is a dense countable subset and $\nu$ is a notation
$\nu :\subseteq \Sigma^* \to A$ of $A$ such that the set $\{(u, v, w, x) \mid \nu_{\mathbb{Q}}(u) < d(\nu(v), \nu(w)) < \nu_{\mathbb{Q}}(x)\}$ is
r.e. [Wei93]. This definition is somewhat stronger than that in [Wei87]. For a
computable metric space $(M, d, A, \nu)$, the Cauchy representation $\delta_C$ [Wei97] is
defined as follows (we assume w.l.o.g. $dom(\nu) \subseteq (\Sigma \setminus \{\sharp\})^*$): $\delta_C(p) = x :\iff
p = u_0 \sharp u_1 \sharp \ldots$ such that $\forall i > k.\ d(\nu(u_i), \nu(u_k)) < 2^{-k}$ and $x = \lim_{i \to \infty} \nu(u_i)$.

Theorem 22. (1) $\nu_d \le \delta_m$. (2) $(M, d_H, M_d, \nu_d)$ is a computable metric space.
(3) The Cauchy representation $\delta_C$ for this space is equivalent to $\delta_m$.
Since $\delta_m \equiv \delta'_m \equiv \delta''_m \equiv \delta_C$, these four representations of the probability
measures $M$ on the space ([0; 1], B) induce the same computability theory and in
particular have the same final topology, which is the topology $\tau$ generated by the
Hutchinson metric. As a consequence, for a hyperbolic [Hut81] computable IFS
with probabilities as in Corollary 12 the unique invariant measure is computable
w.r.t. any of these representations. For a domain-theoretic approach see [Eda96].

Theorem 23. Let $S = ([0; 1], f_1, \ldots, f_k, p_1, \ldots, p_k)$ be a hyperbolic IFS with
probabilities such that $f_1, \ldots, f_k$ are $\delta_\to$-computable and $p_1, \ldots, p_k$ are $\rho$-computable.
Then the unique fixed point $\mu_S$ of the operator $T_S : M \to M$ defined by
$T_S(\mu)(A) := \sum_{i=1}^{k} p_i\, \mu f_i^{-1}(A)$ is $\delta_m$-computable.

In measure theory not only probability measures but arbitrary measures $\mu :
B \to \mathbb{R} \cup \{\infty\}$ are studied. Let $M_b$ be the set of all measures $\mu : B \to \mathbb{R}$,
i.e. all bounded measures on ([0; 1], B). Let $\delta_<$ be the representation of $M_b$
obtained from Definition 1, where $M$ is replaced by $M_b$. While $\delta_m(p)[0; 1] = 1$,
$\delta_<(p)[0; 1]$ may be any non-negative real number. An easy proof shows that
$\mu \mapsto \mu[0; 1]$ is only $(\delta_<, \rho_<)$-computable and not $(\delta_<, \rho)$-continuous. This means, that
information about upper bounds of $\delta_<(p)[0; 1]$ is not available from prefixes of
$p$. As a consequence, Theorem 14 on integration fails for $\delta_<$. Only the following
weak version can be proved: $(f, \mu) \mapsto \int f\,d\mu$ for non-negative $f \in C[0; 1]$ and
$\mu \in M_b$ is $(\delta_\to, \delta_<, \rho_<)$-computable. We can, however, include information
about upper bounds of $\mu[0; 1]$ in the names. Let $\delta_b$ be the representation of
$M_b$ defined by the following notation $\nu$ of atomic pieces of information: $\mu \in
\nu(u \flat v \sharp w) :\iff (\nu_{\mathbb{Q}}(u) < \mu(I_v)$ and $\mu[0; 1] < \nu_{\mathbb{Q}}(w))$. Then the theorems we have proved
for $\delta_m$ hold accordingly for $\delta_b$, in particular Theorem 14 on integration. The
connection to $\delta_m$ is given by the following lemma.

Lemma 24. The function $\mu \mapsto \mu[0; 1]$ on $M_b$ is $(\delta_b, \rho)$-computable, and the
function $\mu \mapsto \mu/\mu[0; 1]$ is $(\delta_b, \delta_m)$-computable for $\mu \in M_b$, $\mu[0; 1] \ne 0$.

5  Conclusion 

In  this  paper  we  have  introduced  and  discussed  a  very  natural  and  canonical  com¬ 
putability  theory  on  the  set  M  of  probability  measures  on  the  Borel  subsets  of 
the  unit  interval  [0;  1].  In  particular,  we  have  shown  that  simple  obvious  require¬ 
ments  exclude  a  number  of  similar  definitions,  that  the  definition  leads  to  the 
expected  computability  results,  that  there  are  other  natural  definitions  induc¬ 
ing  the  same  computability  theory  and  that  the  theory  is  embedded  smoothly 
into  classical  measure  theory.  Although  we  have  only  stated  the  existence  of 
computable  functions  throughout  the  paper,  all  the  proofs  provide  algorithms, 
which  can  be  realized  by  programs  from  some  common  programming  language 
like  PASCAL  or  C.  Of  course  the  basic  definitions  and  many  results  can  be 
transferred  from  the  space  M  to  more  general  spaces  of  measures. 
Abstract. Up to now, the known derandomization methods have been
derived assuming average-case hardness conditions. In this paper we
instead present the first worst-case hardness conditions sufficient to obtain
P = BPP.

Our  conditions  refer  to  the  worst-case  circuit  complexity  of  Boolean  op¬ 
erators  computable  in  time  exponential  in  the  input  size.  Such  results 
are  achieved  by  a  new  method  that  departs  significantly  from  the  usual 
known  methods  based  on  pseudo-random  generators. 

Our  method  also  gives  a  worst-case  hardness  condition  for  the  circuit 
complexity  of  Boolean  operators  computable  in  NC  (with  respect  to  their 
output  size)  to  obtain  NC  =  BPNC. 


1  Introduction 

1.1  Motivations  and  previous  results.  A  major  goal  in  complexity  the¬ 
ory  is  the  study  of  the  real  power  of  randomized  algorithms,  that  is  algorithms 
that  make  decisions  based  on  the  output  of  a  random  source  of  bits.  To  this  aim, 
several  recent  works  have  been  focused  on  the  design  of  general  methods  that  de¬ 
crease  (or  remove)  the  amount  of  random  bits  used  by  these  algorithms.  A  central 
question  in  this  area  is  the  relationship  between  the  existence  of  computationally- 
hard  functions  and  the  existence  of  efficient  derandomization  methods.  Yao  [12], 
and Blum and Micali [5] introduced the concept of Pseudo-Random Generator
(PSRG), i.e. any Boolean operator $G = \{G_n : \{0,1\}^{k(n)} \to \{0,1\}^n,\ n > 0\}$
(denoted by $G : k(n) \to n$) that, for a.e. $n$ and for any Boolean function
$f : \{0,1\}^n \to \{0,1\}$ whose circuit complexity $L(f)$ is at most $n$, satisfies:
$|\Pr(f(y) = 1) - \Pr(f(G_n(x)) = 1)| \le 1/n$ (where $y$ is chosen uniformly at random
from $\{0,1\}^n$, and $x$ from $\{0,1\}^{k(n)}$). The output sets of PSRG are also
called discrepancy sets for circuits of linear size.
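For very small parameters the quantity bounded in the PSRG definition can be computed by brute force. The following sketch is an illustration only: the tests `f` are arbitrary Python functions, not circuits of bounded size, and `toy_generator` is a hypothetical operator from 2 to 3 bits that is deliberately not pseudo-random.

```python
from fractions import Fraction
from itertools import product


def acceptance(f, inputs):
    """Fraction of the given inputs on which f outputs 1."""
    inputs = list(inputs)
    return Fraction(sum(f(y) for y in inputs), len(inputs))


def discrepancy(f, G, k, n):
    """|Pr(f(y) = 1) - Pr(f(G(x)) = 1)| for uniform y in {0,1}^n, x in {0,1}^k."""
    uniform = acceptance(f, product((0, 1), repeat=n))
    generated = acceptance(f, (G(x) for x in product((0, 1), repeat=k)))
    return abs(uniform - generated)


def toy_generator(x):
    """Hypothetical stretch from 2 to 3 bits: append the XOR of the seed bits."""
    return x + (x[0] ^ x[1],)


parity = lambda y: y[0] ^ y[1] ^ y[2]
first_bit = lambda y: y[0]
gap_parity = discrepancy(parity, toy_generator, 2, 3)
gap_first = discrepancy(first_bit, toy_generator, 2, 3)
```

The parity test separates the output set from the uniform distribution with discrepancy 1/2, whereas the first-bit test sees no difference. A PSRG must keep the discrepancy small for every test of bounded circuit size.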

According to the definition used in [10], a Boolean operator $Op : k(n) \to n$
is quick if it can be computed in time polynomial in $n$ (note in passing that
if $k(n) = O(\log n)$ then the "quick" condition is equivalent to assuming that $Op$
belongs to EXP). It is not hard to show [10] that the existence of a quick PSRG
$G : k(n) \to n$ with $k(n) = O(\log n)$ implies P = BPP. Nisan and Wigderson [10]
showed a method to construct quick PSRG based on the existence of Boolean
functions in EXP that have exponential hardness [10]. The hardness condition
used by Nisan and Wigderson requires the existence of a function in EXP that
not only has a hard worst-case circuit complexity^ but also a hard average-case
circuit complexity. More formally, a function $f : \{0,1\}^n \to \{0,1\}$ is $(\epsilon, L)$-hard
if, for any circuit $C$ of size at most $L$, $|\Pr(C(x) = f(x)) - 1/2| \le \epsilon/2$. Given
a Boolean function $F = \{F_n : \{0,1\}^n \to \{0,1\},\ n > 0\}$, the hardness at $n$ of
$F$ (denoted as $H_F(n)$) is defined as the maximum integer $h_n$ such that $F_n$ is
$(1/h_n, h_n)$-hard. Then, $F$ has exponential hardness if $H_F(n) \ge 2^{\Omega(n)}$. Nisan and
Wigderson showed a fundamental "Hardness vs Randomness" result.

Theorem 1. [10] If a Boolean function $F$ exists such that i) $F \in$ EXP, and
ii) $F$ has exponential hardness, then there exists a quick PSRG $G : k(n) \to n$
where $k(n) = O(\log n)$, and consequently P = BPP.

The  hardness  required  by  Nisan  and  Wigderson’s  construction  of  quick  PSRG 
thus refers to average-case complexity. A natural question is then the following:
Does any "worst-case" hardness assumption on the circuit complexity of Boolean
functions computable in time exponential in the input size exist which allows
one to derive an efficient derandomization method (in particular, to obtain
P = BPP)?

We  give  two  answers  to  this  question.  Both  answers  make  use  of  a  new  method 
(informally  described  in  Section  1.3)  that  relies  on  a  particular  class  of  Boolean 
operators  (different  from  PSRG),  denoted  as  Hitting  Set  Generators,  which  have 
been recently introduced in [3]. Let $L(f)$ denote the circuit complexity of a finite
function $f : \{0,1\}^n \to \{0,1\}$ and, given any positive number $dp$, let $L_{dp}(f)$
denote the minimum size of circuits of depth $dp$ which are able to compute $f$.

Definition 2. Let $\epsilon(n)$, $\beta(n)$, and $\gamma(n)$ be polynomial-time computable functions
such that for any $n \ge 1$: $0 < \epsilon(n) < 1$, $n \le \beta(n) \le 2^n$, and $\gamma(n) \ge \log n$.
Then, a Boolean operator $H : k(n) \to n$ is an $(\epsilon(n), \beta(n), \gamma(n))$-Hitting Set Generator
(in short, $(\epsilon(n), \beta(n), \gamma(n))$-HSG) if, for any Boolean function $f$ such that
$L_{\gamma(n)}(f) \le \beta(n)$ and $\Pr(f = 1) \ge \epsilon(n)$, $H$ is required to provide one "example"
$y$ for which $f(y) = 1$, i.e., there exists $a \in \{0,1\}^{k(n)}$ such that $f(H_n(a)) = 1$.
When no depth constraint $\gamma(n)$ is imposed, we will use notation $(\epsilon(n), \beta(n))$-HSG.
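The hitting property of Definition 2 can also be checked by brute force for tiny parameters. As before this is a sketch under assumptions: the tests are plain Python functions without any bound on circuit size or depth, and `toy_operator` is a hypothetical operator from 2 to 3 bits. The names `density`, `hits` and `is_hitting_for` are not from the text.

```python
from fractions import Fraction
from itertools import product


def density(f, n):
    """Pr(f = 1) over uniform inputs from {0,1}^n."""
    return Fraction(sum(f(y) for y in product((0, 1), repeat=n)), 2 ** n)


def hits(f, H, k):
    """True if some seed a in {0,1}^k gives f(H(a)) = 1."""
    return any(f(H(a)) == 1 for a in product((0, 1), repeat=k))


def is_hitting_for(tests, H, k, n, eps):
    """Check the hitting property for the listed tests of density >= eps."""
    return all(hits(f, H, k) for f in tests if density(f, n) >= eps)


def toy_operator(x):
    """Hypothetical stretch from 2 to 3 bits: append the XOR of the seed bits."""
    return x + (x[0] ^ x[1],)


first_bit = lambda y: y[0]
parity = lambda y: y[0] ^ y[1] ^ y[2]
all_ones = lambda y: int(all(y))
eps = Fraction(1, 2)
ok = is_hitting_for([first_bit, all_ones], toy_operator, 2, 3, eps)
bad = is_hitting_for([first_bit, parity], toy_operator, 2, 3, eps)
```

The sparse test `all_ones` is ignored because its density is below `eps`, while `parity` has density 1/2 and is missed by every output of `toy_operator`. A hitting set generator only has to find one accepted input for each dense test, which is a weaker demand than the small discrepancy required of a PSRG.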

By making a simple comparison between the definition of discrepancy sets
and that of hitting sets it should be clear that HSG satisfy a property
significantly weaker than that of PSRG. Nevertheless, Andreev et al. [3] proved that,
given any BPP-algorithm A, the output of any quick HSG can be transformed
into an ad hoc discrepancy set for A by means of a deterministic polynomial-time
algorithm.

^ By the circuit complexity of a finite Boolean function $f$, we will always mean
the size of the smallest circuit that computes $f$.
Theorem 3. [3] Let $k(n) = O(\log n)$ and let $\epsilon$ be any constant such that $0 <
\epsilon < 1$. If there exists a quick $(\epsilon, n)$-HSG $H : k(n) \to n$ then P = BPP.

As  we  will  describe  in  Section  1.3,  the  polynomial-time  algorithm  in  [3]  is 
of  independent  interest  and  it  is  used  in  this  paper  to  obtain  Theorem  5.  On 
the  other  hand,  more  recently  (after  the  submission  of  our  paper),  a  different 
algorithmic  proof  of  Theorem  3  has  been  given  in  [4].  This  algorithm  is  simpler 
and  runs  in  NC^ . 

1.2 Our results. We give two worst-case hardness conditions which are
sufficient to construct quick HSG that satisfy Theorem 3 thus obtaining P = BPP.
The circuit complexity of a Boolean operator $H$ will be denoted as $L^{op}(H)$.
Observe that if $L^{op}(k, n)$ denotes the worst-case circuit complexity of Boolean
operators $H : k(n) \to n$, then it is known [9, 11] that, for any $\log n \le k \le n$,
$L^{op}(k, n) = (1 + o(1))(2^k n)/(k + \log n)$. Furthermore, for a.e. Boolean operator
$H : k \to n$, we have $L^{op}(H) = \Theta((2^k n)/(k + \log n))$. The first condition
deals with the worst-case circuit complexity of characteristic functions of sets
generated by Boolean operators.

Theorem 4. Let $\delta$ be such that $0 < \delta < 1/2$, and let $k(n) = (1 + O(1)) \log n$. If
there exists a quick operator $H : k(n) \to n$ such that the characteristic function
of its output sets $F^H = \{F^H_n : \{0,1\}^n \to \{0,1\}$, where $F^H_n(x) = 1 \iff \exists\, y \in
\{0,1\}^{k(n)}$ s.t. $H_n(y) = x,\ n > 0\}$ satisfies

$L(F^H_n) \ge (1/2 + \delta)(2^{k(n)} n)/(k(n) + \log n)$,

then it is possible to construct a quick operator $H' : k'(n) \to n$ where $k'(n) =
O(\log n)$ such that $H'$ is an $(\epsilon, n)$-HSG for some constant $0 < \epsilon < 1$, thus
P = BPP.

Another way to state the above theorem is the following. Assume that there
exists a sparse language $S = \{S_n \subseteq \{0,1\}^n,\ n > 0\}$ that can be generated by
a uniform algorithm which runs in time polynomial in $n$, and such that the
worst-case circuit complexity of deciding $S$ is not smaller (up to some constant
factor) than the worst-case circuit complexity of generating languages $S'$ having
the same sparsity factor as $S$. Then P = BPP.

The  second  sufficient  condition  to  obtain  a  quick  HSG  refers  directly  to  the 
worst-case  circuit  complexity  of  Boolean  operators  instead  of  the  characteristic 
functions  of  their  output  sets. 

Theorem 5. Let $k(n) = O(\log n)$. Let $H : k(n) \to n$ be a quick operator such
that for a.e. $n$,

L°P(Hn)  >  L‘’^{Kn)  -  (2*W)/(fc(n)^). 

Then, for any constant $0 < \epsilon < 1$, and for any positive integer $q$, it is possible to construct a quick $(1 - \epsilon, n^q)$-HSG $H' : k'(n) \to n$, where $k'(n) = O(\log n)$, thus obtaining P = BPP.
Furthermore, using the new “parallel” proof of Theorem 3, we provide here a worst-case hardness condition for Boolean operators sufficient to derandomize any BPNC algorithm (i.e. to obtain BPNC = NC).

Theorem 6. A constant $0 < c_0 < 1$ exists such that if an operator $H : k(n) \to n$ with $k(n) = O(\log n)$ exists such that 1) $H$ is an NC operator^, and 2) for any $d \ge 1$ there exists a constant $c$ with $0 < c_0 < c < 1$ such that the characteristic function of its output sets satisfies $L_{\log^d n}(F^H_n) \ge c\,(2^{k(n)} n)/(k(n) + \log n)$, then NC = BPNC.


1.3 Our method and further connections with other works. All of our proofs share a common method based on the following fact. There is a precise trade-off between the worst-case circuit complexity of partial Boolean functions and the number of 1's in their outputs. In particular, we formalize the intuitive fact that a partial Boolean function having a hard worst-case circuit complexity cannot return 0 for a “large” number of inputs. This property is used to construct the preliminary versions of our HSG, which are then combined with a convenient use of the properties of expander graphs [2] (to obtain Theorem 4) and with a new analysis of the performance of the already mentioned algorithm of Andreev et al. [3] (to obtain Theorem 5).

Finally, we remark that hardness vs randomness results similar to those obtained in our paper have been obtained, independently from our work, by Impagliazzo and Wigderson in [6]. Their method (based on the derandomization of the XOR-lemma) achieves a trade-off which is stronger than ours in the case of sequential algorithms (i.e. BPP algorithms). However it is not clear, to our present knowledge, whether their method can be applied to obtain trade-offs for parallel computation (like ours) since they use, in a rather involved way, expander walks which seem to be hard to parallelize.

Due  to  the  lack  of  space,  proofs  will  be  given  in  the  full  version  of  this  paper. 


2  Preliminary  results  on  the  circuit  complexity  of  partial 
Boolean  functions 

Let $F(n, N, m)$ be the set of all partial Boolean functions $f(x_1, \ldots, x_n)$ defined on $N \le 2^n$ inputs and assuming 1 on $m \le N$ inputs. Furthermore, $L(n,N,m)$ denotes the worst-case circuit complexity of functions from $F(n,N,m)$, and $L_{depth}(n,N,m)$ denotes the maximum value $L_{depth}(f)$ among all functions $f$ from $F(n,N,m)$. Lupanov [9] obtained the asymptotic bounds for the case of total Boolean functions.

However,  in  order  to  construct  quick  HSG  we  need  that  Lupanov’s  results 
hold  also  for  partial  Boolean  functions.  In  particular,  the  generalization  of  the 
upper  bounds  cannot  be  derived  directly  from  the  proofs  in  [9].  Then  we  give  a 


^  With  “NC  operator”,  we  will  always  mean  an  operator  which  is  computable  in  NC 
with  respect  to  the  size  of  its  output 
reduction from general Boolean functions to the restricted case of total Boolean functions which is based on a probabilistic construction of suitable linear operators.

Theorem 7.

$$L(n,N,m) = (1 + o(1))\,\frac{\log \binom{N}{m}}{\log\log \binom{N}{m}} + O(n).$$

Furthermore a constant $c > 0$ exists such that

$$L_{c \log n}(n,N,m) = (1 + o(1))\,\frac{\log \binom{N}{m}}{\log\log \binom{N}{m}} + O(n).$$
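
As a rough numerical illustration of a bound of this shape, the sketch below evaluates the leading term $\log_2 \binom{N}{m} / \log_2\log_2 \binom{N}{m}$ for concrete values. It assumes the binomial form of the bound and base-2 logarithms; the function name `leading_term` is ours and does not appear in the paper, and the $o(1)$ and $O(n)$ terms are ignored.

```python
import math


def leading_term(N: int, m: int) -> float:
    """Leading term log2 C(N, m) / log2 log2 C(N, m) (assumed form of the bound).

    N is the number of inputs on which the partial function is defined and
    m the number of inputs on which it takes value 1.
    """
    count = math.comb(N, m)            # number of ways to place the m ones
    log_count = math.log2(count)       # works for arbitrarily large integers
    return log_count / math.log2(log_count)


if __name__ == "__main__":
    # A total function on n = 10 variables with half of the outputs equal to 1:
    # the value is close to the classical 2^n / n.
    print(leading_term(1024, 512), 2 ** 10 / 10)
    # A sparse function on the same domain is much cheaper.
    print(leading_term(1024, 16))
```

For a total function ($N = 2^n$, $m = N/2$) the value is close to $2^n/n$, while for few 1's it drops sharply, which is the trade-off exploited in the next section.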


3  Hard  characteristic  functions  and  HSG 


The following theorem provides a first trade-off between the hardness of characteristic functions of Boolean subsets and their hitting properties®.


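Theorem 8. Let $0 < c_2 < 1$ be a constant [and $d \ge 1$], and let $S_n \subseteq \{0,1\}^n$ be any subset such that $|S_n| \le b_n$, where $b_n =$ Suppose that for the characteristic function $F_n$ of $S_n$ we have


i) $L(F_n) \ge c_2\,(b_n n)/(\log b_n + \log n)$ [ $L_{\log^d n}(F_n) \ge c_2\,(b_n n)/(\log b_n + \log n)$ ].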


Then, for any constant $c_1$ such that $0 < c_1 < c_2$, for any Boolean function
$f(x_1, \ldots, x_n)$ such that


a)  Pr  (/  =  1)  >  1  -  2^""'  and  Hi)  L{f)  <  bn  [  Hi’)  n(/)  ^  ], 


there exists $a \in S_n$ for which $f(a) = 1$.


Sketch of the proof. Suppose, by contradiction, that $f$ satisfies conditions ii) and iii) but for any $a \in S_n$ we have $f(a) = 0$. Let $Z \subseteq \{0,1\}^n$ be the subset of all inputs on which $f = 0$. Clearly, we have $S_n \subseteq Z \subseteq \{0,1\}^n$. Then consider the partial Boolean function $g(x_1, \ldots, x_n)$ defined as follows: $g(a) = 1$ if $a \in S_n$, $g(a) = 0$ if $a \in Z \setminus S_n$, and $g(a)$ is not defined if $a \in \{0,1\}^n \setminus Z$. Since $|Z| \le 2^{c_1 n}$ and $|S_n| \le b_n$, from Theorem 7, we have

$$L(g) \le (1 + o(1))\,\frac{\log \binom{|Z|}{|S_n|}}{\log\log \binom{|Z|}{|S_n|}} + O(n) \le (1 + o(1))\,c_1\,(b_n n)/(\log b_n + \log n).$$

From $S_n \subseteq Z$, it is easy to prove that, given any $a$, $F_n(a)$ can be computed as $g(a) \wedge \neg f(a)$. Hence


®  Each  result  will  be  given  in  both  sequential  and  “parallel”  version.  The  latter  will 
be  included  in  square  brackets. 
$$L(F_n) \le L(g) + L(f) + O(1) \le (1 + o(1))\,c_1 \frac{b_n n}{\log b_n + \log n} + b_n + O(1) \le (1 + o(1))\,c_1 \frac{b_n n}{\log b_n + \log n}.$$

For sufficiently large $n$, this last upper bound is in contradiction with hypothesis i) of our theorem. The “parallel” version of the theorem can be easily derived using the same contradiction argument. □
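
The identity at the heart of this argument, $F_n(a) = g(a) \wedge \neg f(a)$, can be checked by brute force on a toy instance. The sketch below is only an illustration: the sizes, the random choice of $S_n$ and $Z$, and the name `check_identity` are our own assumptions. It fills the undefined part of $g$ with arbitrary bits, mirroring the fact that a circuit for the partial function $g$ may output anything outside $Z$.

```python
import itertools
import random


def check_identity(n: int = 4, seed: int = 0) -> bool:
    """Check F(a) == g(a) AND NOT f(a) on every a in {0,1}^n for a toy instance."""
    rng = random.Random(seed)
    points = list(itertools.product((0, 1), repeat=n))

    S = set(rng.sample(points, 3))        # the set S_n to be hit
    Z = S | set(rng.sample(points, 5))    # zeros of f; by assumption S is inside Z

    f = {a: 0 if a in Z else 1 for a in points}
    F = {a: 1 if a in S else 0 for a in points}   # characteristic function of S
    # g is only specified on Z; outside Z we plug in arbitrary bits.
    g = {
        a: (1 if a in S else 0) if a in Z else rng.randint(0, 1)
        for a in points
    }
    return all(F[a] == (g[a] & (1 - f[a])) for a in points)


if __name__ == "__main__":
    print(all(check_identity(seed=s) for s in range(20)))
```

Outside $Z$ the factor $\neg f(a)$ is 0, so the arbitrary values of $g$ there never matter; inside $Z$ the factor is 1 and $g$ agrees with $F_n$.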


In what follows, we will consider HSG which always have a monotone price function $k(n)$ such that, for any $n > 0$, $k(n+1) - k(n) \le 1$ and $n^{\alpha} \ge k(n) \ge \log n$ where $0 < \alpha < 1$. Let $H : k(n) \to n$ be a Boolean operator with $k(n) = O(\log n)$, and let $F^H = \{F^H_n : \{0,1\}^n \to \{0,1\},\ n > 0\}$ be the corresponding family of the characteristic functions.


Corollary 9. Suppose that a quick [NC] operator $H : k(n) \to n$ exists such that $k(n) = (1 + O(1)) \log n$ and a constant $0 < c_2 < 1$ exists such that, for a.e. $n$, $L(F^H_n) \ge c_2\,(2^{k(n)} n)/(k(n) + \log n)$ [ $L_{\log^d n}(F^H_n) \ge c_2\,(2^{k(n)} n)/(k(n) + \log n)$ for some $d \ge 1$ ]. Then, for any positive constant $q$ and for any constant $c_1$ such that $0 < c_1 < c_2$, it is possible to construct a quick [NC] operator $H' : k'(n) \to n$ with $k'(n) = O(\log n)$ and such that $H'$ is an $(1 - 2^{(c_1 - 1)n}, n^q)$-HSG [ $H'$ is an $(1 - 2^{(c_1 - 1)n}, n^q, \log^d n)$-HSG ].


3.1  Improved  HSG  using  expanders 

Corollary 9 gives a quick HSG for the class of polynomial size circuits (functions) $C$ that have a very large fraction of 1's, i.e. $\Pr(C = 1) \ge 1 - 2^{-cn}$ for some positive constant $c$ smaller than 1. However, this hitting property does not suffice to derandomize BPP-algorithms (see Theorem 3). It is in fact required to hit all linear-size circuits having “only” a constant fraction of 1's. To this aim, we will combine the HSG in Corollary 9 with a random walk on expanders, a tool that has often been used to decrease randomness in probabilistic algorithms.

An undirected graph $G(V, E)$ is a $(d, c)$-expander if the maximum degree of a vertex is $d$, and for every set $W \subseteq V$ of cardinality $|W| \le |V|/2$, the inequality $|N(W) - W| \ge c|W|$ holds, where $N(W)$ denotes the set of all vertices adjacent to some vertex in $W$. The expanding properties of a graph can be established by determining the value of its second largest eigenvalue. Indeed, if $\lambda$ is an upper bound on the second largest eigenvalue of a $d$-regular graph $G(V,E)$, then $G$ is a $(d,c)$-expander for $c = (d - \lambda)/2d$. Expander graphs have the following important “hitting” property proved by Ajtai et al. [1].

Theorem 10. Let $G(V,E)$ be a $d$-regular graph, and assume that its second largest eigenvalue is at most $\lambda > 0$. Let $W \subseteq V$ be any subset such that $|W| = \alpha n$ ($\alpha < 1$). Then, for every $t > 0$, the number of walks of length $t$ in $G$ that avoid $W$ is at most $n (1 - \alpha)^{1/2} ((1 - \alpha) d^2 + \lambda^2)^{t/2}$.
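
To make the statement concrete, the following sketch counts the walks that avoid a set $W$ in a small graph by dynamic programming and compares the count with the bound above. Several things here are our own assumptions rather than part of the paper: we read $n$ as $|V|$, we use the complete graph $K_8$ as a toy $d$-regular graph (its nontrivial eigenvalues all equal $-1$, so we take $\lambda = 1$), and the helper names are hypothetical.

```python
def count_avoiding_walks(adj, avoid, steps):
    """Count walks with `steps` edges that never visit a vertex in `avoid`."""
    # ending_at[v] = number of avoiding walks of the current length ending at v
    ending_at = {v: 1 for v in adj if v not in avoid}
    for _ in range(steps):
        ending_at = {
            v: sum(ending_at.get(u, 0) for u in adj[v])
            for v in ending_at
        }
    return sum(ending_at.values())


def walk_bound(n, alpha, d, lam, steps):
    """The bound n (1-alpha)^(1/2) ((1-alpha) d^2 + lam^2)^(steps/2)."""
    return n * (1 - alpha) ** 0.5 * ((1 - alpha) * d * d + lam * lam) ** (steps / 2)


def complete_graph(n):
    """Adjacency lists of K_n, an (n-1)-regular graph."""
    return {v: [u for u in range(n) if u != v] for v in range(n)}


if __name__ == "__main__":
    adj = complete_graph(8)
    W = {0, 1, 2, 3}                      # alpha = 1/2
    for t in range(1, 8):
        walks = count_avoiding_walks(adj, W, t)
        print(t, walks, walks <= walk_bound(8, 0.5, 7, 1, t))
```

In $K_8$ with half of the vertices forbidden, the number of avoiding walks is $4 \cdot 3^t$, far fewer than the $8 \cdot 7^t$ walks in total, and it stays below the bound for every $t$ tried.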
In [7], a polynomial-time algorithm is presented that, given $n > 0$ and $d \le n$, constructs a $d'$-regular expander $G$ such that $d' = O(d)$, $|V| = O(n)$, and its second largest eigenvalue $\lambda > 0$ is such that $\lambda \le 2\sqrt{d'}$ (such graphs are called Ramanujan graphs).

For any $n > 0$, consider a $d$-regular Ramanujan expander $EP_n = (V_n, E_n)$ where $2^n \le |V_n| \le 2^{n+1}$ [7]. Observe that the Boolean strings with last component equal to 0 correspond to the input set of the function we want to hit. This assumption is required when $EP_n$ cannot be constructed on vertex sets whose size is exactly a power of 2. Let $l = \lceil \log d \rceil$. We suppose that $d$ is a large but constant value. Then, we consider the operator $EPR_{n,t} : \{0,1\}^{n + l(2^t - 1) + t} \to \{0,1\}^n$ such that

$$EPR_{n,t}(a, u_1, \ldots, u_{2^t-1}, s), \quad a \in \{0,1\}^n,\ u_i \in \{0,1\}^l,\ s \in \{0,1\}^t,$$

are the first $n$ components of the vertex indexed by $s$ in the $EP_n$-walk of length $2^t$ which starts from vertex $(a, 0)$ and is uniquely determined by the sequence of edge choices in the neighborhood of each vertex, encoded by $u_1, \ldots, u_{2^t-1}$. Observe that if $t = \Theta(\log n)$, the operator $EPR_{n,t}$ can be computed in time polynomial
in $n$. Consider now a Boolean function $g(x_1, \ldots, x_n)$, and the operator $EPR^g_{n,t} : \{0,1\}^{n + l(2^t - 1)} \to \{0,1\}$ that performs the OR among the values of $g$ computed on the input points visited by a fixed $EP_n$-walk of length $2^t$, i.e.,

$$EPR^g_{n,t}(a, u_1, \ldots, u_{2^t-1}) = \bigvee_{s \in \{0,1\}^t} g(EPR_{n,t}(a, u_1, \ldots, u_{2^t-1}, s)). \qquad (1)$$
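
The OR-along-a-walk operator can be sketched in a few lines. This is only an illustration under our own assumptions: a small 4-regular circulant graph on 16 vertices stands in for the Ramanujan expander, vertices are integers rather than bit strings, and the helper names are hypothetical. A walk is fixed by a start vertex and a list of edge choices, one per step, so $2^t - 1$ choices visit $2^t$ vertices.

```python
def circulant_neighbors(v, size=16):
    """Neighbors of v in a toy 4-regular circulant graph (stand-in for EP_n)."""
    return [(v + 1) % size, (v - 1) % size, (v + 4) % size, (v - 4) % size]


def walk_vertices(neighbors, start, choices):
    """Vertices visited by the walk that starts at `start` and follows `choices`.

    choices[i] is the index of the edge taken at step i, so a list of
    2^t - 1 choices yields 2^t visited vertices.
    """
    path = [start]
    for c in choices:
        path.append(neighbors(path[-1])[c])
    return path


def or_along_walk(g, neighbors, start, choices):
    """OR of g over all vertices visited by the walk (the role of EPR^g)."""
    return int(any(g(v) for v in walk_vertices(neighbors, start, choices)))


if __name__ == "__main__":
    choices = [0, 2, 1]                    # 2^2 - 1 edge choices, 4 vertices
    print(walk_vertices(circulant_neighbors, 0, choices))
    print(or_along_walk(lambda v: v == 5, circulant_neighbors, 0, choices))
```

The walk-based operator outputs 0 only if the whole walk stays inside the zero set of $g$, which is exactly the event whose probability the expander hitting property controls.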

As a consequence of Theorem 10, we can prove the following bound.

Lemma  11.  //Pr(5  =  0)  <c<  then  Pr  {EPRi^  =  O)  <  (c+^)^ 

Theorem 12. Assume that there exists a quick operator $H : k(n) \to n$, such that $k(n) = (1 + O(1)) \log n$ and the characteristic functions of its output sets satisfy

$$L(F_n) \ge \left(\frac{\log(4\lambda)}{\log d} + \delta\right) (2^{k(n)} n)/(k(n) + \log n)$$

for some constant $\delta > 0$. Then it is possible to construct a quick operator $H'' : k''(n) \to n$ with $k''(n) = O(\log n)$ and such that $H''$ is an $(1 - \epsilon, n)$-HSG for some constant $0 < \epsilon < 1$, thus P = BPP.


4  Hitting  Set  Generators  for  BPNC 

Ramanujan graphs cannot be used to derive NC Hitting Set Generators since no efficient parallel method to perform random walks on such graphs is presently available. However, Zuckerman [13] recently introduced an NC construction of samplers which can replace the role of expanders in our construction. In particular, we can use the following result.
Theorem 13. [13] Any BPNC algorithm that uses $n$ random bits and has error probability bounded by 1/3 can be simulated by a BPNC algorithm that uses $r(n) = O(n)$ random bits and has error probability bounded by $(1/2)^n$.

Informally speaking, this result allows us to consider only “parallel” circuits having a fraction of 1's not smaller than $1 - 2^{-cn}$ for some fixed constant $0 < c < 1$. By using the same method of Section 3.1, we can combine Corollary 9 and Theorem 13 to obtain the following result.

Theorem 14. A constant $0 < c_2 < 1$ exists such that the following holds. Assume that there exists an NC operator $H : k(n) \to n$ with $k(n) = (1 + O(1)) \log n$ and such that, for any constant $d \ge 1$, the characteristic functions of its output sets satisfy $L_{\log^d n}(F^H_n) \ge \delta\,(2^{k(n)} n)/(k(n) + \log n)$, for some constant $\delta \ge c_2$. Then it is possible to construct an NC operator $H' : k'(n) \to n$ with $k'(n) = O(\log n)$ and such that $H'$ is an $(1 - \epsilon, n, \log^d n)$-HSG for any constant $0 < \epsilon < 1$ and $d \ge 1$.

In  the  next  corollary,  the  above  HSG  is  combined  with  the  new  “parallel” 
proof  of  Theorem  3  given  in  [4] . 

Corollary 15. A constant $0 < c_2 < 1$ exists such that if an NC operator $H : k(n) \to n$ exists that satisfies the same conditions of Theorem 14 then NC = BPNC.

Note. In the previous version of this paper (when the new proof of Theorem 3 was still unknown) we were able to provide only sufficient hardness conditions to obtain ZNC = BPNC. The proof of this weaker result is of independent interest and has been used in [4] to obtain some results in the context of weak random sources. A new version of this proof can be found in [4].


5  Hitting  sets  from  hard  Boolean  operators 

The construction of an efficient HSG from a Boolean operator which has hard circuit complexity is based on the following “contradiction” argument. Suppose that a Boolean operator $T : \{0,1\}^m \to \{0,1\}^n$ is not an HSG for a certain class of circuits defined by the parameters $\epsilon(n)$ and $\beta(n)$ (see Def. 2). Roughly speaking, this negative fact implies that the output sequence of $T$ can be represented by a new binary sequence which contains a “large” number of 0's (this number depends on $\epsilon(n)$ and $\beta(n)$). Then, using the technique of Andreev et al. shown in [3], it is possible to compress this new binary sequence in order to prove an upper bound on the circuit complexity of $T$. This bound is obtained by a new analysis of the compression rate achieved by this technique and by applying the upper bound for the Shannon function $L(n,N,m)$ in Theorem 7. If $T$ is supposed to have a hard circuit complexity, we get a contradiction.




5.1  Compressing  Boolean  operators 

Let $T : \{0,1\}^m \to \{0,1\}^n$ and $C(x_1, \ldots, x_n)$ be a circuit with $n$ inputs. Given $a \in \{0,1\}^n$, consider the function $Med(C,T,a) = 2^{-m} \sum_{u \in \{0,1\}^m} C(T(u) \oplus a)$ (as in the proof of Corollary 15). It is easy to prove that $E(Med(C,T,a)) = \Pr(C(x_1, \ldots, x_n) = 1)$ where the expected value is computed with respect to $a$. We briefly describe here the technique of Andreev et al. introduced in [3]. Let $\alpha_1$ and $\alpha_2$ be two different elements in $\{0,1\}^n$. Define $d_1 = Med(C,T,\alpha_1)$ and $d_2 = Med(C,T,\alpha_2)$ and assume that $D = d_2 - d_1 > 0$. The $j$-th component of $a$ will be denoted as $[a]^j$. Since we are considering the case in which $D > 0$, we can assume that there exists an index $s$ for which $[\alpha_1]^s \ne [\alpha_2]^s$. Consider the operator $T^\# : \{0,1\}^m \to \{0,1\}^n$ defined as follows: $T^\#(u) = T(u) \oplus ([T(u)]^s \cdot (\alpha_1 \oplus \alpha_2))$ where the operation $\cdot$ is the standard scalar product. The $s$-th component of $T^\#(u)$ satisfies the following equations:

$$[T^\#(u)]^s = [T(u)]^s \oplus ([T(u)]^s \cdot ([\alpha_1]^s \oplus [\alpha_2]^s)) = [T(u)]^s \oplus [T(u)]^s \cdot 1 = 0. \qquad (2)$$

Observe also that the set $\{T^\#(u) \oplus \alpha_1, T^\#(u) \oplus \alpha_2\}$ is equal to the set $\{T(u) \oplus \alpha_1, T(u) \oplus \alpha_2\}$.
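
Both properties of the modified operator, namely that its $s$-th component is always 0 and that the pair of shifted points is unchanged as a set, can be verified mechanically on one output value at a time. The sketch below does this for random bit vectors; the tuple representation and the helper names `xor` and `t_sharp` are our own assumptions, not notation from [3].

```python
import random


def xor(a, b):
    """Componentwise XOR of two bit tuples of equal length."""
    return tuple(x ^ y for x, y in zip(a, b))


def t_sharp(t_u, alpha1, alpha2, s):
    """Modified output: T(u) XOR ([T(u)]^s * (alpha1 XOR alpha2)).

    t_u is the value T(u); s is an index where alpha1 and alpha2 differ.
    """
    if t_u[s] == 1:
        return xor(t_u, xor(alpha1, alpha2))
    return t_u


def check_once(n, rng):
    alpha1 = tuple(rng.randint(0, 1) for _ in range(n))
    s = rng.randrange(n)
    # Build alpha2 so that it differs from alpha1 at index s (and maybe elsewhere).
    alpha2 = tuple(
        (1 - bit) if j == s else rng.randint(0, 1)
        for j, bit in enumerate(alpha1)
    )
    t_u = tuple(rng.randint(0, 1) for _ in range(n))
    t_new = t_sharp(t_u, alpha1, alpha2, s)
    same_pair = {xor(t_new, alpha1), xor(t_new, alpha2)} == {
        xor(t_u, alpha1),
        xor(t_u, alpha2),
    }
    return t_new[s] == 0 and same_pair


if __name__ == "__main__":
    rng = random.Random(1)
    print(all(check_once(6, rng) for _ in range(1000)))
```

When $[T(u)]^s = 1$ the two shifted points are simply swapped, which is why the circuit $C$ sees the same unordered pair of inputs before and after the modification.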


$$N(\sigma, \phi_1, \phi_2) = |\{u : [T(u)]^s = \sigma,\ C(T(u) \oplus \alpha_1) = \phi_1 \text{ and } C(T(u) \oplus \alpha_2) = \phi_2\}|. \qquad (3)$$

We can now introduce the function which approximates the $s$-th component of $T(u)$. Consider the function $Q$ defined as follows:

$$Q_{N(\sigma,\phi_1,\phi_2)}(x, y) = \begin{cases} x & \text{if } x \ne y \\ 1 & \text{if } x = y = 0 \text{ and } N(1,0,0) \ge N(0,0,0) \\ 0 & \text{if } x = y = 0 \text{ and } N(1,0,0) < N(0,0,0) \\ 1 & \text{if } x = y = 1 \text{ and } N(1,1,1) \ge N(0,1,1) \\ 0 & \text{if } x = y = 1 \text{ and } N(1,1,1) < N(0,1,1) \end{cases}$$


In what follows we will consider the function $N$ as a fixed parameter, and thus we will omit the index $N(\sigma, \phi_1, \phi_2)$ in the definition of $Q$. Then the approximation function for the $s$-th bit of $T(u)$ is $Z(u) = Q(C(T^\#(u) \oplus \alpha_1), C(T^\#(u) \oplus \alpha_2))$, $z = 1, \ldots, m$. Our next goal is to estimate the number of errors generated by $Z(u)$. For each triple $(\sigma, \phi_1, \phi_2)$, consider the number of inputs $u$ such that the following conditions are satisfied: i) $[T(u)]^s \oplus Z(u) = 1$ (i.e. there is an error); ii) $[T(u)]^s = \sigma$; iii) $C(T(u) \oplus \alpha_1) = \phi_1$; iv) $C(T(u) \oplus \alpha_2) = \phi_2$.

The following Lemma gives an upper bound on the number of errors in approximating the $s$-th bit of $T(u)$.

Lemmaie.  [S]  E(<T,^.i,fc)g{o,i}"  <^'2)  <  (2  “  ' 
Some new hardness-compression trade-offs. Using Lemma 16, we are now able to give a useful bound on the circuit complexity of $T$. Observe that the function $U(u) = [T(u)]^s \oplus Z(u)$ with $u \in \{0,1\}^m$ singles out the positions in $T$ for which an error occurs.

Lemma 17. $L(T) \le L^{op}(m, n - 1) + L(U) + O(L(C)) + O(n)$.

Lemma 18. If for some constant $c_1$ we have that $D \ge c_1$, then there exists a constant $c_2 < 1$ such that $L(U) \le c_2 (2^m/m)$.


5.2  The  Hitting  Set  Generator 

In order to derive our HSG, we will make use of the following result given by Lupanov (see also [11]). Let $L^{op}(k,n)$ denote the worst-case circuit complexity of Boolean operators having $k$ variables and $n$ outputs. Then $L^{op}(k,n) = (1 + o(1))(2^k n)/(k + \log n)$.
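
For orientation, the leading term of this bound is easy to evaluate numerically. The sketch below assumes base-2 logarithms and ignores the $(1 + o(1))$ factor; the function name `operator_bound` is ours.

```python
import math


def operator_bound(k: int, n: int) -> float:
    """Leading term 2^k * n / (k + log2 n) of the worst-case operator complexity."""
    return (2 ** k) * n / (k + math.log2(n))


if __name__ == "__main__":
    n = 32
    for k in (5, 10, 15):                  # k = log n, 2 log n, 3 log n
        print(k, operator_bound(k, n), 2 ** k * n)
```

With $k = O(\log n)$ the bound is polynomial in $n$, and it is smaller than the trivial truth-table size $2^k n$ by a factor of about $k + \log n$.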

Theorem 19. Assume that a quick operator $H : k(n) \to n$ exists such that
A;(n)  =  (1 +  0(1))  logn,  and  for  a.e.  n  >  L{k{n),n) 

Then, it is possible to construct a $(1/2, n)$-HSG $H' : k'(n) \to n$ such that $k'(n) = O(\log n)$. Hence, P = BPP.

Acknowledgements. We are grateful to Luca Trevisan for several interesting discussions.
Abstract. We construct an oracle relative to which NP has p-measure 0 but $D^p$ has measure 1 in EXP. This gives a strong relativized negative answer to a question posed by Lutz [Lut96]. Secondly, we give strong evidence that BPP is small. We show that BPP has p-measure 0 unless EXP = MA and thus the polynomial-time hierarchy collapses. This contrasts with the work of Regan et al. [RSC95], where it is shown that P/poly does not have p-measure 0 if exponentially strong pseudorandom generators exist.


1  Introduction 

Since the introduction of resource-bounded measure by Lutz [Lut92], many researchers investigated the size (measure) of complexity classes in exponential time (EXP). A particular point of interest is the hypothesis that NP does not have p-measure 0. Recent results have shown that many reasonable conjectures in computational complexity theory follow from the hypothesis that NP is not small (i.e., $\mu_p(\mathrm{NP}) \ne 0$), and hence it seems to be a plausible scientific hypothesis [LM96, Lut96].

In [Lut96], Lutz shows that if $\mu_p(\mathrm{NP}) \ne 0$ then BPP is low for $\Delta_2^p$. He shows that this even follows from the seemingly weaker hypothesis that $\mu_p(\Delta_2^p) \ne 0$. He asks whether the latter assumption is weaker than or equivalent to $\mu_p(\mathrm{NP}) \ne 0$. In this paper we show that, relative to some oracle, the two assumptions are not equivalent.

We show a relativized world where $D^p$ = EXP whereas NP has no P-bi-immune sets. This immediately implies, via a result of Mayordomo [May94a], that in this relativized world, NP has p-measure 0 and $D^p$, and hence $\Delta_2^p$, has measure 1 in EXP, and thus does not have p-measure 0, or even $p_2$-measure 0.

*  URL:  http://www.cwi.nl/cwi/people/Harry.Buhrman.html.  E-mail: 

buhrman@cwi.nl.  Partially  supported  by  the  Dutch  foundation  for  scientific  research 
(NWO)  by  SION  project  612-34-002,  and  by  the  European  Union  through  Neuro- 
COLT  ESPRIT  Working  Group  Nr.  8556,  and  HC&M  grant  nr.  ERB4050PL93-0516. 

**  URL:  http://www.cs.usm.maine.edu/~fenner/.  Email:  fenner@cs.usm.maine.edu. 
Partially  supported  by  NSF  grant  CCR  92-09833. 

***  URL:  http://www.cs.uchicago.edu/~fortnow.  Email:  fortnow@cs.uchicago.edu.  Sup¬ 
ported  in  part  by  NSF  grant  CCR  92-53582,  the  Dutch  Foundation  for  Scientific 
Research  (NWO)  and  a  Fulbright  Scholar  award. 




This shows in a very strong way that relativized measure for NP and $P^{NP[2]}$ differ: $\mu_p(\mathrm{NP}) = 0$ whereas $\mu_p(P^{NP[2]}) \ne 0$. Here $P^{NP[2]}$ is the class of sets recognized by polynomial time Turing machines that are allowed two queries to an NP oracle. We show that our results cannot be improved to $P^{NP[1]}$.

Secondly, we investigate the possibility that BPP does not have p-measure 0. Intuitively BPP is a feasible complexity class close to P and therefore it should be the case that BPP is small. We give very strong evidence supporting this intuition. We show that $\mu_p(\mathrm{BPP}) = 0$ unless EXP = MA and thus the polynomial-time hierarchy collapses.

Since BPP $\subseteq$ P/poly our result contrasts with the one by Regan, Sivakumar and Cai [RSC95], where it is shown that $\mu_p(\mathrm{P/poly}) \ne 0$, unless exponentially strong pseudorandom generators do not exist.

2  Preliminaries 

We let $\Sigma = \{0,1\}$ and identify strings in $\Sigma^*$ with natural numbers via the usual binary representation. We fix $N_1, N_2, \ldots$ to be a standard enumeration of all nondeterministic polynomial-time oracle Turing machines (NOTMs), where for each $i$ and input of length $n$, $N_i$ runs in time $n^i$ for all oracles. All our machines run using symbols 0, 1 and blanks. Fix a deterministic oracle TM $M$ which accepts some standard $\le^p_m$-complete language for $\mathrm{EXP}^A$ for all $A \subseteq \Sigma^*$. We may assume that $M$ runs in time $2^n$. We let $\langle \cdot, \cdot \rangle$ be the standard pairing function, and we note that $x, y \le \langle x, y \rangle$ for all $x, y \in \Sigma^*$. A set is in $D^p$ if it can be expressed as the difference of two sets in NP.

The notations $\mathbb{R}$, $\mathbb{Q}$, $\mathbb{R}^+$ and $\mathbb{Q}^+$ denote the real numbers, the rational numbers, the positive real numbers and the positive rational numbers respectively.

2.1  Resource  Bounded  Measure 

Classical Lebesgue measure is an unusable tool in complexity classes. As these classes are all countable, everything we define in such a class has measure 0. Yet, we might wish to have a notion of “abundance” and “randomness” in complexity classes. Lutz [Lut87, Lut90] introduced the notion of resource bounded measure, and gave a tool to talk about these notions inside complexity classes.

Definition 1. A martingale $d$ is a function from $\Sigma^*$ to $\mathbb{R}^+$ with the property that $d(w0) + d(w1) = 2d(w)$ for every $w \in \Sigma^*$.

Definition 2. A p-martingale is a martingale $d : \Sigma^* \to \mathbb{Q}^+$ that is polynomial time computable.

Definition 3. A martingale $d$ succeeds on a language $A$ if

$$\limsup_{n \to \infty} d(\chi_A[0 \cdots n-1]) = +\infty.$$

We write $S^\infty[d] = \{A \mid d \text{ succeeds on } A\}$.
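
A martingale can be pictured as the capital of a gambler who bets on the successive bits of a characteristic sequence at fair odds. The sketch below builds such a function from a betting strategy and checks the averaging condition $d(w0) + d(w1) = 2d(w)$ on all short strings. The construction from a strategy, the initial capital 1, and the names `make_martingale` and `is_martingale` are our own illustrative assumptions, not definitions from the paper.

```python
from fractions import Fraction
from itertools import product


def make_martingale(bet_on_one):
    """Build d from a strategy: bet_on_one(prefix) is the fraction of the
    current capital wagered on the next bit being '1' (a number in [0, 1])."""

    def d(w: str) -> Fraction:
        capital = Fraction(1)              # assumed initial capital d("") = 1
        for i, bit in enumerate(w):
            p = Fraction(bet_on_one(w[:i]))
            # Fair odds: the stake on the correct bit is doubled.
            capital *= 2 * (p if bit == "1" else 1 - p)
        return capital

    return d


def is_martingale(d, max_len: int) -> bool:
    """Check d(w0) + d(w1) == 2 d(w) for every string w of length <= max_len."""
    return all(
        d(w + "0") + d(w + "1") == 2 * d(w)
        for n in range(max_len + 1)
        for w in map("".join, product("01", repeat=n))
    )


if __name__ == "__main__":
    d = make_martingale(lambda prefix: Fraction(3, 4))   # always favour '1'
    print(is_martingale(d, 6), d("111"), d("000"))
```

The strategy that always puts three quarters of its capital on '1' multiplies its capital by 3/2 on each '1' and by 1/2 on each '0', so it succeeds on languages whose characteristic sequence is eventually all 1's and fails on most others.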
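Definition 4. Let $\mathcal{X}$ be a class of languages.

- $\mathcal{X}$ has p-measure 0 ($\mu_p(\mathcal{X}) = 0$) iff there exists a p-martingale $d$ such that $\mathcal{X} \subseteq S^\infty[d]$

- $\mathcal{X}$ has p-measure 1 ($\mu_p(\mathcal{X}) = 1$) iff $\mu_p(\mathcal{X}^c) = 0$

- $\mathcal{X}$ has p-measure 0 in EXP ($\mu_p(\mathcal{X} \mid \mathrm{EXP}) = 0$) iff $\mu_p(\mathcal{X} \cap \mathrm{EXP}) = 0$

- $\mathcal{X}$ has p-measure 1 in EXP ($\mu_p(\mathcal{X} \mid \mathrm{EXP}) = 1$) iff $\mu_p(\mathcal{X}^c \cap \mathrm{EXP}) = 0$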

One  often  defines  measure  in  EXP  using  p2 -measure  where  the  martingale 
can  use  ^  time.  All  of  our  results  also  hold  in  this  weaker  model. 


3  Measure  of  NP  versus  Measure  of 

In this section we concentrate on the question posed by Lutz [Lut96]. We show that relative to some oracle $\mu_p(\mathrm{NP}) = 0$ does not imply that $\mu_p(P^{NP}) = 0$. We do this in a very strong way by constructing an oracle such that NP does not contain P-bi-immune sets and $D^p$ = EXP.

Theorem 5. There exists an oracle $A$ such that, relative to $A$, NP has no P-bi-immune sets and $D^p$ = EXP.

Proof. We will code EXP into $D^p$ on one “side” of the oracle and prevent P-bi-immunity on the other, i.e., strings in $\Sigma^* 0 = \{x0 \mid x \in \Sigma^*\}$ will be used to code EXP into $D^p$, while strings in $\Sigma^* 1 = \{x1 \mid x \in \Sigma^*\}$ will code the information to find an infinite subset of each NP set or its complement. Some diagonalization will also be necessary to force certain NP computations.

To mix coding with diagonalization, we employ a simplified version of the trick used to construct an oracle for $P^{NP}$ = NEXP [BT94, FF95]. For each $x$, we reserve two potential regions (left and right) in which to code $M^A(x)$, only one of which will actually be used. To code correctly in a region we must let exactly one string in the region enter $A$. We will code in the left region unless we have to diagonalize against some NP machine, which may necessitate adding several strings of the left region to $A$. If this happens, we scrap the left region and code in the right region, but we can do this only if our diagonalization hasn't already put strings of the right region into $A$.

We now proceed with the formal treatment. For every $x \in \Sigma^*$ with $|x| = n$ and $b \in \Sigma$, we call $s$ an $(x,b,\text{left})$-coding string (respectively, an $(x,b,\text{right})$-coding string) if $s = xyb00$ (respectively, $s = xyb10$) for some $y \in \Sigma^*$ of length $3n$. We identify left and right with 0 and 1, respectively. We build the oracle $A$ in stages, each successive stage extending a finite portion of $A$'s characteristic function. If $\alpha : \Sigma^* \to \Sigma$ is some partial characteristic function, $N$ an oracle machine, and $x \in \Sigma^*$, then the computation $N^\alpha(x)$ is defined as usual, except that when $N$ makes any query outside domain($\alpha$), it is answered negatively. As is customary, we regard $\alpha$ as a set of ordered pairs. If $\beta$ is another characteristic function, we write $\beta \succeq \alpha$ to mean that $\beta$ extends $\alpha$. Finally, define the “tower of 2's” function $t(n)$ for $n \ge 0$ by

$t(0) = 1$

$t(n+1) = 2^{t(n)}$
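
The tower function grows so quickly that the diagonalization stages $n = t(k)$ are extremely sparse. A minimal sketch, assuming the recurrence $t(0) = 1$, $t(n+1) = 2^{t(n)}$ (the helper `tower_index` is our own addition and only searches small indices):

```python
def tower(n: int) -> int:
    """Tower of 2's: tower(0) = 1 and tower(n + 1) = 2 ** tower(n)."""
    value = 1
    for _ in range(n):
        value = 2 ** value
    return value


def tower_index(n: int, max_k: int = 5):
    """Return k with tower(k) == n, or None if n is not such a value.

    Only indices up to max_k are tried, since tower(5) already has
    almost twenty thousand decimal digits.
    """
    for k in range(max_k + 1):
        if tower(k) == n:
            return k
    return None


if __name__ == "__main__":
    print([tower(k) for k in range(5)])
    print(tower_index(16), tower_index(17))
```

Because consecutive values satisfy $t(k+1) = 2^{t(k)}$, a computation of length polynomial in $t(k)$ cannot query strings as long as $t(k+1)$, which is what keeps the different diagonalization stages from interfering with each other.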

Stage $-1$.

$\alpha_{-1} := \emptyset$.

End Stage.

Stage $n \ge 0$.

We are given $\alpha_{n-1}$. Set $\alpha := \alpha_{n-1}$.

1. (Forcing an NP computation) If $n \ne t(k)$ for any $k$, then set $d_n :=$ right if $\alpha(s) = 1$ for some $(x,b,\text{left})$-coding string $s$ with $|x| = n$, and $d_n :=$ left otherwise, and go to step 2. Otherwise, let $n = t(k)$ for some $k = \langle i,j \rangle$. If there exists a minimal $\beta \succeq \alpha$ such that both

(a) $N_i^\beta(0^n)$ has an accepting path in which all queries are in domain($\beta$), and

(b) for no $x$ with $|x| \ge n$ and no $(x,b,\text{right})$-coding string $s$ does $\beta(s) = 1$,

then set $\alpha := \beta \cup \{(0^n 1, 1)\}$ and set $d_n :=$ right (note that $\beta$ is only defined on strings no longer than $n^i$). Otherwise, set $\alpha := \alpha \cup \{(0^n 1, 0)\}$ and set $d_n :=$ left.

2. (Preserving computations of $M$) For all $x$ of length $n$, run $M^\alpha(x)$, and extend $\alpha$ with just enough 0's to “cover” all queries made by $M^\alpha(x)$ not in domain($\alpha$).

3. (Coding computations of $M$) For all $x \in \Sigma^*$ of length $n$, let $y \in \Sigma^*$ be the lexicographically least string (if one exists) such that $|y| = 3n$ and neither the $(x,0,d_n)$-coding string nor the $(x,1,d_n)$-coding string corresponding to $y$ is in domain($\alpha$). If $M^\alpha(x)$ accepts, set $\alpha := \alpha \cup \{(xy1d_n0, 1)\}$; otherwise, set $\alpha := \alpha \cup \{(xy0d_n0, 1)\}$.

4. Set $\alpha_n$ to be $\alpha$ extended with just enough 0's to cover all remaining $(x,b,d)$-coding strings for all $b \in \Sigma$, $d \in \{\text{left}, \text{right}\}$, and $x$ of length $n$.

End Stage.

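Let $A$ be such that $\chi_A$ extends $\alpha_n$ for all $n$ ($\chi_A(x) = 0$ for any $x \notin \bigcup_n \mathrm{domain}(\alpha_n)$). For any $B \subseteq \Sigma^*$, define the language $L^B$ by

$L^B(x) = 1$ if either $B$ contains an $(x,1,\text{right})$-coding string, or $B$ contains no $(x,0,d)$-coding strings for any $d \in \{\text{left}, \text{right}\}$,

$L^B(x) = 0$ otherwise.

Clearly, $L^B \in \mathrm{co}D^{p,B}$. We now show that $L^A(x) = M^A(x)$ for all $x \in \Sigma^*$, and hence $\mathrm{co}D^{p,A} = \mathrm{EXP}^A = D^{p,A}$.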
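
The decoding rule can be exercised on small finite sets. The sketch below assumes the coding-string layout $s = x\,y\,b\,d\,0$ with $|y| = 3|x|$ and left/right identified with 0/1; the function names are our own, and the finite set `B` merely stands in for an oracle.

```python
LEFT, RIGHT = "0", "1"


def coding_string(x: str, y: str, b: str, d: str) -> str:
    """Build the (x, b, d)-coding string x y b d 0, where |y| = 3|x|."""
    assert len(y) == 3 * len(x) and b in "01" and d in (LEFT, RIGHT)
    return x + y + b + d + "0"


def parse_coding_string(s: str):
    """Return (x, b, d) if s has the shape x y b d 0 with |y| = 3|x|, else None."""
    if len(s) < 3 or (len(s) - 3) % 4 != 0 or s[-1] != "0":
        return None
    n = (len(s) - 3) // 4
    return s[:n], s[-3], s[-2]


def language_L(B, x: str) -> int:
    """L^B(x): 1 iff B has an (x,1,right)-coding string or no (x,0,d)-coding string."""
    tags = set()
    for s in B:
        parsed = parse_coding_string(s)
        if parsed is not None and parsed[0] == x:
            tags.add((parsed[1], parsed[2]))
    has_one_right = ("1", RIGHT) in tags
    has_zero = any(b == "0" for b, _ in tags)
    return int(has_one_right or not has_zero)


if __name__ == "__main__":
    x, y = "01", "000000"
    print(language_L({coding_string(x, y, "1", LEFT)}, x))   # coded "accept" on the left
    print(language_L({coding_string(x, y, "0", LEFT)}, x))   # coded "reject" on the left
```

The rule is the complement of a difference of two NP-type conditions ("some 0-string exists" and "no (1, right)-string exists"), which is why a single decoder works whether the left or the right region ends up being used.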

Pick an $n$ large enough, and fix an input $x$ of length $n$. In Step 3 of Stage $n$, such a $y$ must exist: there are at most $2^n \cdot (2^{n+1} - 1)$ $(x,b,d)$-coding strings queried by $M$ on inputs of length $\le n$, because of the running time of $M$, and
less  than  n  •  ^  <  2^06”)^  total  strings  queried  by  the  Ni  in  Step  1  of  Stages 

0 through $n$. Thus there are fewer than $2^{3n}$ $(x,b,d)$-coding strings in domain($\alpha$) at Step 3 of Stage $n$.

The fact that

$$M^A(x) = L^A(x) \qquad (1)$$

is now easily seen: first we observe that no $(x,b,\text{right})$-coding string (for any $b \in \Sigma$) gets into $A$ in Steps 1 or 2 of any stage. Thus we have two cases:

$d_n$ = left: For any $b \in \Sigma$ and $d \in \{\text{left}, \text{right}\}$, the only $(x,b,d)$-coding string that ever enters $A$ does so in Step 3 of Stage $n$. This unique string is an $(x,1,\text{left})$-coding string if $M^A(x)$ accepts, and is otherwise an $(x,0,\text{left})$-coding string; thus, (1) is satisfied.

$d_n$ = right: Exactly one $(x,b,\text{right})$-coding string enters $A$. It is an $(x,1,\text{right})$-coding string iff $M^A(x)$ accepts. Again, (1) is satisfied.

It remains to show that $\mathrm{NP}^A$ has no $P^A$-bi-immune sets. This will be done if we can show that for any $L \in \mathrm{NP}^A$, there exist $P^A$ sets $Q$ and $R$ with $Q$ infinite, such that $L \cap Q = R$ (or at least the symmetric difference of $L \cap Q$ and $R$ is finite). Let $L = L(N_i^A)$ for some fixed $i$. Let

$Q = \{0^n \mid (\exists j)\ n = t(\langle i,j \rangle)\}$,

$R = Q \cap \{0^n \mid 0^n 1 \in A\}$.

The sets $Q$ and $R$ are clearly in $P^A$. Pick $n = t(\langle i,j \rangle)$ for $j$ large enough so that $t(\langle i,j \rangle + 1) = 2^n > n^i$, and consider Step 1 of Stage $n$. If $\beta$ exists, then $N_i^A(0^n)$ accepts and $0^n 1 \in A$, so $0^n \in R$. If no such $\beta$ exists, then $0^n \notin R$. To see that $N_i^A(0^n)$ rejects, we simply observe that $d_n = d_{n+1} = \cdots = d_{n^i}$ = left, so no $(x,b,\text{right})$-coding strings enter $A$ in any of the stages $n$ through $n^i$. Therefore, $A$ preserves our conditions on the nonexistence of $\beta$, and so $N_i^A(0^n)$ rejects.

Corollary 6. There exists an oracle relative to which NP has p-measure 0 and $D^p$ = EXP (and thus $D^p$ has p-measure 1 in E and in EXP).

We actually get something more from the construction above: relative to $A$,
we have $\mathrm{EXP} \subseteq (\mathrm{NP} \cap \mathrm{coNP})/1$. That is, EXP can be computed in $\mathrm{NP} \cap \mathrm{coNP}$
with one bit of advice for strings of length $n$, namely $d_n$. On input $x$ of length $n$,
an $\mathrm{NP}^A$ machine accepting $L(M^A)$ (respectively, its complement) simply checks if there
is some $(x, 1, d_n)$-coding string (respectively, some $(x, 0, d_n)$-coding string) in $A$.

A  natural  question  is  whether  Theorem  5  and  Corollary  6  are  tight.  It  could 
still  happen  that  yUp(NP)  =  0  and  //p(P^^f^])  /  0.  The  next  theorem  discards 
this  possibility. 


Theorem  7.  If  7^  0  then  Pp{NF)  ^  0. 
Proof.  ^  0  implies  that  SAT  is  weakly  <^^^-complete  for  EXP.  Ambos-Spies,
Mayordomo,  and  Zheng  [ASMZ96]  have  shown  that  the  weakly  <1^^-
completeness  notion  coincides  with  weakly  ^^-completeness  for  EXP.  Hence
SAT  is  weakly  <^^-complete  for  EXP  and  thus  $\mu_p(\mathrm{NP}) \neq 0$.

Corollary  8.  Relative  to  the  oracle  constructed  in  Theorem  5  it  holds  that  = 
coBP  +  . 

4  BPP  likely  has  measure  0 

In this section we investigate the consequences of BPP not having p-measure
0. We will see that this is unlikely since it would collapse the polynomial-time
hierarchy. Hence we provide strong evidence that $\mu_p(\mathrm{BPP}) = 0$.

Theorem 9. If $\mu_p(\mathrm{BPP}) \neq 0$ then EXP = MA.

Since $\mathrm{MA} \subseteq \Sigma_2^p \cap \Pi_2^p$ [BM89], EXP = MA implies that $\mathrm{PH} = \Sigma_2^p$.

We use the following theorem from Babai, Fortnow, Nisan and Wigderson
[BFNW93], stating that if $\mathrm{EXP} \neq \mathrm{MA}$ then BPP can be simulated in
subexponential time for infinitely many input lengths.

Theorem 10 [BFNW93]. If $\mathrm{EXP} \neq \mathrm{MA}$ then for all $L \in \mathrm{BPP}$, and for all $\epsilon > 0$,
there exists a set $L' \in \mathrm{DTIME}(2^{n^{\epsilon}})$ such that for infinitely many $n$, $L \cap \Sigma^n = L' \cap \Sigma^n$.

We will see that if BPP can be simulated in subexponential time for infinitely
many input lengths, then it has p-measure 0. Taking this together with
Theorem 10 yields that $\mathrm{EXP} \neq \mathrm{MA}$ implies that $\mu_p(\mathrm{BPP}) = 0$, which proves
Theorem 9.

Theorem 11. If for all languages $L \in \mathrm{BPP}$ there exists an $\epsilon < 1$ and a set
$L' \in \mathrm{DTIME}(2^{n^{\epsilon}})$ such that for infinitely many $n$, $L \cap \Sigma^n = L' \cap \Sigma^n$, then
$\mu_p(\mathrm{BPP}) = 0$.

Proof. (Sketch) We will construct a martingale that succeeds on all sets in BPP
that runs in time for some fixed $k$. Let $L \in \mathrm{BPP}$ and let $M_{L'}$ be the machine
that runs in subexponential time and accepts $L'$. If we are betting on strings of
length $n$ such that $L \cap \Sigma^n = L' \cap \Sigma^n$, then we can use $M_{L'}$ to predict exactly
the next bit, and hence we win $2^n$ times. The problem, however, is that we do
not know for which $n$ the machine $M_{L'}$ is going to be correct. We overcome this
problem by the following strategy.

Assume that our initial capital is 1. We reserve $2^{-n}$ to bet on the strings
of length $n$, using $M_{L'}$ to predict the next bit (i.e., whether the next string of
length $n$ is in $L'$). We bet everything won so far on the strings of length $n$ on
the outcome of $M_{L'}$. At the last string of length $n$ we set aside what (if anything) we
have won betting on the strings of length $n$.

Observe that if $n$ is a length such that $L \cap \Sigma^n = L' \cap \Sigma^n$, then we win $2^{2^n} \cdot 2^{-n}$,
and this is greater than $n$. So for infinitely many $n$ we add more than $n$ to our capital and
hence the lim-inf of this martingale goes to infinity.
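
The betting strategy above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the reserve for length $n$ is read as $2^{-n}$, the functions `banked_winnings`, `parity` and `good_only_on_length_3` are hypothetical names chosen for the example, and a toy predictor stands in for the machine $M_{L'}$.

```python
from fractions import Fraction
from itertools import product


def banked_winnings(language, predictor, max_len):
    """Capital set aside after betting on all strings of lengths 1..max_len.

    For each length n a reserve of 2**-n is staked. The whole current stake
    is bet on the predictor's answer for every string of length n, so a
    correct prediction doubles it and a single wrong one loses it.
    """
    banked = Fraction(0)
    for n in range(1, max_len + 1):
        stake = Fraction(1, 2 ** n)          # reserve for length n
        for bits in product("01", repeat=n):
            if stake == 0:                   # everything for this length is lost
                break
            x = "".join(bits)
            if predictor(x) == language(x):
                stake *= 2                   # fair bet won: stake doubles
            else:
                stake = Fraction(0)          # fair bet lost
        banked += stake                      # set aside what was won
    return banked


def parity(x):
    """Toy language (an assumption of this example): odd number of ones."""
    return x.count("1") % 2 == 1


def good_only_on_length_3(x):
    """Toy predictor that agrees with `parity` exactly on length 3."""
    return parity(x) if len(x) == 3 else not parity(x)
```

With this toy predictor only length 3 contributes, and it contributes $2^{2^3} \cdot 2^{-3} = 32 > 3$, matching the estimate in the proof sketch.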

To  make  the  construction  work  uniformly  for  all  L  E  BPP  we  simulate  all 
the  DTIME(2’^)  machines  with  a  single  DTIME(2^^)  machine  allocating  2“*  of 
our  initial  capital  to  machine  i  (see  [Lut92,  May94b]). 
Abstract. In [3] we exhibited a simple boolean function $f_n$ in $n$ variables
such that:

1) $f_n$ can be computed by a polynomial-size randomized ordered read-once
branching program with small one-sided error;

2) any nondeterministic ordered read-once branching program that computes
$f_n$ has exponential size.

In this paper we present a simple boolean function $g_n$ in $n$ variables such
that:

1) $g_n$ can be computed by a polynomial-size nondeterministic ordered read-once
branching program;

2) any two-sided error randomized ordered read-once branching program
that computes $g_n$ has exponential size.

This means that BPP and NP are incomparable in the context of ordered
read-once branching programs.


1  Preliminaries 

Branching programs are a well-known model of computation for discrete functions
[14]. Many types of restricted branching programs have been investigated as
important theoretical models of computation [9]. Ordered read-once branching
programs, or ordered binary decision diagrams (OBDDs) [4, 15], are also important for
practical computer science. They are used in circuit verification. But many
important functions cannot be computed by deterministic read-once branching
programs of polynomial size [4, 13, 8].

In [2] we introduced the model of randomized branching programs and showed
that randomized ordered read-once branching programs can be more effective
than deterministic ones. In [3] we defined an explicit boolean function $f_n$ in $n$
variables which can be computed by a polynomial-size randomized ordered read-once
branching program, while any nondeterministic ordered read-once branching
program needs exponential size to compute $f_n$. Martin Sauerhoff [10] considered
the function from Theorem 3 of [6]. He proved that this function needs (as in the
deterministic case) exponential-size randomized read-once branching programs for
one-sided error. In this paper we present an explicit function which is "simple"
for nondeterministic ordered read-once branching programs, but is "hard"
for randomized read-once branching programs with two-sided error.

Research supported by Russia Fund for Basic Research 96-01-01962. ablayev@ksu.ru

Together  with  the  result  from  [3]  this  proves  that  complexity  classes  BPP  and 
NP  are  incomparable  in  the  context  of  ordered  read-once  branching  programs. 

Note that the results of the paper for ordered read-once branching programs
also hold for a more general model, the weak-ordered branching program, which
we define in the paper. Informally speaking, the weak-ordered property for a branching
program $P$ means that there exists a partition of its set $\{x_1, x_2, \ldots, x_n\}$ of variables
into two parts $X_1$ and $X_2$, $X_1 \cap X_2 = \emptyset$, such that for any computation path of
$P$ the following is true: if a variable from $X_2$ is tested, then no variable from $X_1$
can be tested in the remaining part of this path.

A deterministic branching program $P$ for computing a function $g : \{0,1\}^n \to \{0,1\}$
is a directed acyclic multi-graph with a distinguished source node $s$ and a
distinguished sink node $t$. The out-degree of each non-sink node is exactly 2, and
the two outgoing edges are labeled by $x_i = 0$ and $x_i = 1$ for the variable $x_i$ associated
with the node. Call such a node an $x_i$-node. The label "$x_i = \delta$" indicates that only
inputs satisfying $x_i = \delta$ may follow this edge in the computation. The branching
program $P$ computes the function $g$ in the obvious way: for each $\sigma \in \{0,1\}^n$ we let
$g(\sigma) = 1$ iff there is a directed $s$-$t$ path starting in the source $s$ and leading
to the accepting node $t$ such that all labels $x_i = \sigma_i$ along this path are consistent
with $\sigma = \sigma_1, \sigma_2, \ldots, \sigma_n$.

The branching program becomes nondeterministic [5] if we allow "guessing
nodes", that is, nodes whose two outgoing edges are unlabeled. Unlabeled edges
allow all inputs to proceed. A nondeterministic branching program $P$ computes
a function $g$ in the obvious way; that is, $g(\sigma) = 1$ iff there exists (at least one)
computation on $\sigma$ starting in the source node $s$ and leading to the accepting
node $t$.
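
The two definitions above can be made concrete with a short evaluator. This is a minimal sketch, not the paper's formalism: the dictionary encoding of nodes, the names `ACCEPT`, `REJECT`, `accepts`, `size` and `OR_PROGRAM`, and the explicit rejecting sink are all assumptions of this example.

```python
ACCEPT, REJECT = "accept", "reject"   # REJECT is an assumed second sink


def accepts(program, source, sigma):
    """True iff some path consistent with sigma leads from source to ACCEPT.

    `program` maps a node name to (var, succ0, succ1). If var is an index,
    the node tests x_var and follows succ0 or succ1 according to sigma[var].
    If var is None, the node is a guessing node and both edges may be taken.
    """
    stack, seen = [source], set()
    while stack:
        node = stack.pop()
        if node == ACCEPT:
            return True
        if node == REJECT or node in seen:
            continue
        seen.add(node)
        var, succ0, succ1 = program[node]
        if var is None:                       # unlabeled edges: try both
            stack.extend((succ0, succ1))
        else:                                 # labeled edges: follow sigma
            stack.append(succ1 if sigma[var] else succ0)
    return False


def size(program):
    """Internal nodes minus guessing nodes, the size measure attributed to [5]."""
    return sum(1 for var, _, _ in program.values() if var is not None)


# A tiny nondeterministic program computing x_0 OR x_1 (0-based indices).
OR_PROGRAM = {
    "guess": (None, "test0", "test1"),
    "test0": (0, REJECT, ACCEPT),
    "test1": (1, REJECT, ACCEPT),
}
```

A deterministic program is the special case with no guessing nodes, in which case the search follows a single path.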

Define a randomized branching program [2] as one which has, in addition to
its standard inputs, specially designated inputs called "random inputs". When
values of these "random inputs" are chosen from the uniform distribution, the
output of the branching program is a random variable.

Say that a randomized branching program $(a, b)$-computes a boolean function
$f$ if it outputs 1 with probability at most $a$ for inputs $\sigma$ such that $f(\sigma) = 0$ and
outputs 1 with probability at least $b$ for inputs $\sigma$ such that $f(\sigma) = 1$.

As usual, for a branching program $P$ (deterministic or randomized) we define
size($P$) (the complexity of the branching program $P$) as the number of internal
nodes in $P$. Following [5], define the size($P$) of a nondeterministic branching
program $P$ as the number of internal nodes in $P$ minus the number of guessing
nodes.

A read-once branching program is a branching program in which, on each path,
each variable is tested no more than once. An ordered read-once branching program
is a read-once branching program which respects a fixed ordering $\pi$ of
the variables, i.e., if an edge leads from an $x_i$-node to an $x_j$-node, the condition
$\pi(i) < \pi(j)$ has to be fulfilled.
2 Results

We specify a boolean function $f_n$ of $n = 4l$ variables as follows. For a sequence
$\sigma \in \{0,1\}^{4l}$ call the odd bits of $\sigma$ "type" bits and the even bits of $\sigma$ "value" bits. Say that
an even bit $\sigma_i$ of $\sigma$, $i \in \{2, 4, \ldots, 4l\}$, has type 0 (1) if the corresponding odd bit $\sigma_{i-1}$ is
0 (1). For a sequence $\sigma \in \{0,1\}^{4l}$ denote by $\sigma^0$ ($\sigma^1$) the subsequence of $\sigma$ that consists
of all even bits of type 0 (1).

For every $\sigma \in \{0,1\}^n$ the boolean function $f_n : \{0,1\}^n \to \{0,1\}$ is defined as
$f_n(\sigma) := 1$ iff $\sigma^0 = \sigma^1$.
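
As an illustration of this definition, the following sketch evaluates $f_n$ directly. It assumes the reading given above (odd positions, counted from 1, are type bits; each even position holds the value governed by the type bit just before it); the function name `f_n` and the list encoding are choices made for this example.

```python
def f_n(sigma):
    """Evaluate f_n on a 0/1 list of length n = 4l.

    Positions are 1-based in the text, so the type bits are sigma[0::2]
    and the value bits are sigma[1::2] in Python's 0-based indexing.
    """
    if len(sigma) % 4 != 0:
        raise ValueError("f_n is defined for n = 4l variables")
    types, values = sigma[0::2], sigma[1::2]
    sigma0 = [v for t, v in zip(types, values) if t == 0]  # value bits of type 0
    sigma1 = [v for t, v in zip(types, values) if t == 1]  # value bits of type 1
    return int(sigma0 == sigma1)
```

For example, on the input 0 1 1 1 0 0 1 0 the type-0 values are 1, 0 and the type-1 values are 1, 0, so the function is 1. Note that the two subsequences must also have the same length for equality to hold.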

Definition 1. Call a branching program a $\pi$-weak-ordered branching program if
it respects a partition $\pi$ of the variables $\{x_1, x_2, \ldots, x_n\}$ into two parts $X_1$ and $X_2$
such that if an edge leads from an $x_i$-node to an $x_j$-node, where $x_i \in X_t$ and
$x_j \in X_m$, then the condition $t \le m$ has to be fulfilled.

Call a branching program $P$ weak-ordered if it is $\pi$-weak-ordered for some
partition $\pi$ of the set of variables of $P$ into two sets.

Clearly, every ordered read-once branching program is also weak-ordered. We
proved the following result in [3] (we use here a restricted variant of this result).

Theorem  2.  For  the  function  fn  the  following  is  true: 

1.  fn  can  be  (e{n)X)- computed  by  randomized  ordered  read-once  branching 
program  of  the  size 

2.  Any  nondeterministic  ordered  read-once  branching  program  that  computes 

function  fn  has  the  size  no  less  than  . 

Now define the function $g_n$, which is "hard" for randomized computation but is
"simple" for nondeterministic computation in our model of branching programs.
This boolean function was presented in [11]. Let $n$ be an integer and let $p[n]$ be the
smallest prime greater than or equal to $n$. Then, for every integer $s$, let $\omega_n(s)$ be
defined as follows. Let $j$ be the unique integer satisfying $j \equiv s \bmod p[n]$ and
$1 \le j \le p[n]$. Then $\omega_n(s) = j$ if $1 \le j \le n$, and $\omega_n(s) = 1$ otherwise.

For every $n$, the boolean function $g_n : \{0,1\}^n \to \{0,1\}$ is defined as $g_n(\sigma) = \sigma_j$,
where $j = \omega_n\left(\sum_{i=1}^{n} i\,\sigma_i\right)$.
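
The definition of $g_n$ can be sketched in a few lines. The argument of $\omega_n$ is taken to be the weighted sum $\sum_{i=1}^{n} i\,\sigma_i$, which is an assumption recovered from the way the function is used later in the text; the helper names `smallest_prime_at_least`, `omega` and `g_n` are hypothetical.

```python
def smallest_prime_at_least(n):
    """Smallest prime p[n] with p[n] >= n (trial division, fine for small n)."""
    candidate = max(n, 2)
    while any(candidate % d == 0 for d in range(2, int(candidate ** 0.5) + 1)):
        candidate += 1
    return candidate


def omega(n, s):
    """omega_n(s): the residue j of s modulo p[n] taken in 1..p[n],
    returned as is when j <= n and replaced by 1 otherwise."""
    p = smallest_prime_at_least(n)
    j = s % p
    if j == 0:
        j = p
    return j if j <= n else 1


def g_n(sigma):
    """g_n(sigma) = sigma_j with j = omega_n(sum of i * sigma_i), 1-based."""
    n = len(sigma)
    weighted_sum = sum(i * bit for i, bit in enumerate(sigma, start=1))
    return sigma[omega(n, weighted_sum) - 1]
```

For $n = 4$ we have $p[4] = 5$; on the input 1 0 0 1 the weighted sum is 5, the residue in $1..5$ is 5, which exceeds $n$, so $\omega_4$ falls back to 1 and the output is $\sigma_1 = 1$.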

We will use the following notation in the rest of the paper. Let
$h : \{0,1\}^n \to \{0,1\}$ be a boolean function. Consider a partition $\pi$ of the variables
$\{x_1, x_2, \ldots, x_n\}$ into two parts $X_1 = \{x_i : i \in I\}$ and $X_2 = \{x_j : j \in J\}$, where
$I \subseteq \{1, 2, \ldots, n\}$, $|I| = l$ and $J = \{1, 2, \ldots, n\} \setminus I$, $|J| = t$.

Denote by $L$, $R$ the sets of binary sequences of length $l$ and $t$ with indexes from $I$
and $J$ respectively. For $u \in L$ and $w \in R$ let $(u, w)$ mean the sequence $\sigma$ from
$\{0,1\}^n$ in which the bits with indexes from $I$, respectively $J$, have the same values as
in $u$, respectively $w$. We will also use the notation $h(u, w)$ instead of $h(\sigma)$ where
it is convenient.

Consider one-way randomized communication computation. We use the following
standard model of one-way randomized communication computation for a
function $h$. Two players $A$ and $B$ receive respectively $u \in L$ and $w \in R$. In the
randomized one-way model, $A$ sends the messages $\beta_1, \beta_2, \ldots, \beta_d$ with probabilities
$p_1, p_2, \ldots, p_d$ respectively ($\sum_{i=1}^{d} p_i = 1$). $B$, on receipt of $\beta_i$, outputs 1 with
probability $q_i$ and 0 with probability $1 - q_i$. The probability distribution on the
set of messages sent by $A$ is entirely determined by the input at $A$ alone, and
is not influenced by the input at $B$. Similarly, the probabilities $q_i$ at $B$ depend
only on its input and the message $\beta_i$ received.

In the computation $T_\phi(u, w)$, the probability of outputting the bit $b = 1$ is
$\sum_{i=1}^{d} p_i(u)\, q_i(w)$ and that of outputting the bit $b = 0$ is $1 - \sum_{i=1}^{d} p_i(u)\, q_i(w)$.
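
The output probability of such a one-way protocol is a single inner product. The sketch below assumes that, for fixed inputs $u$ and $w$, the message distribution $p_1(u), \ldots, p_d(u)$ and the acceptance probabilities $q_1(w), \ldots, q_d(w)$ are given as two lists; the function name `prob_output_one` is hypothetical.

```python
from fractions import Fraction


def prob_output_one(p_u, q_w):
    """Probability that B outputs 1 when A sends message i with
    probability p_u[i] and B, on receiving message i, outputs 1 with
    probability q_w[i]."""
    if len(p_u) != len(q_w):
        raise ValueError("one acceptance probability is needed per message")
    if sum(p_u) != 1:
        raise ValueError("message probabilities must sum to 1")
    return sum(p * q for p, q in zip(p_u, q_w))
```

With two equally likely messages and acceptance probabilities 1 and 1/4, the protocol outputs 1 with probability 1/2 + 1/8 = 5/8, and 0 with probability 3/8.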

Let $p = 1/2 + \varepsilon$ for $0 \le \varepsilon \le 1/2$. Say that the probabilistic protocol $\phi$
$p$-computes a function $h$ if for every input $\sigma = (u, w)$ it holds that $h(\sigma) = b$ iff the
probability of outputting the bit $b$ in the computation $T_\phi(u, w)$ is no less than
$p$.

Let a set $U \subseteq \{0,1\}^n$ be such that $U = L \times R$. The randomized communication
complexity $C(\phi)$ of the probabilistic protocol $\phi$ on the inputs from $U$ is
$\lceil \log |M(\phi)| \rceil$, where $M(\phi)$ is the set of messages used by $\phi$ during computations
on inputs from $U$. For $p \in [1/2, 1]$ the randomized communication complexity
$PC^{\pi}_{p,U}(h)$ of a boolean function $h$ is

$\min\{C(\phi) : \text{protocol } \phi \text{ } p\text{-computes } h \text{ for the partition } \pi \text{ of inputs from } U\}$.

The proof of the following lemma is based on a technique for simulating a weak-ordered
branching program by a communication protocol and is similar to the simulation
technique from [1] (Lemma 6.1).

Lemma 3. Let $\varepsilon \in [0, 1/2]$, $p = 1/2 + \varepsilon$. Let a randomized $\pi$-weak-ordered branching
program $P$ $(1-p, p)$-compute a function $h : \{0,1\}^n \to \{0,1\}$. Let $U \subseteq \{0,1\}^n$
be such that $U = L \times R$, where $L$ and $R$ are defined according to the partition $\pi$
of inputs. Then

$\mathrm{size}(P) \ge 2^{PC^{\pi}_{p,U}(h) - 1}$.

Proof. We describe the following communication protocol $\phi$ which $p$-computes
the function $h$ for the partition $\pi$ of inputs.

Let $\sigma \in U$ be an input, $\sigma = (u, w)$, $u \in L$, $w \in R$. Players $A$ and $B$
receive respectively $u$ and $w$ according to the partition $\pi$ of inputs. Let $v_1, \ldots, v_d$
be all internal nodes of $P$ that are reachable during paths of computation on the
part $u$ of the input $\sigma$ with nonzero probabilities $p_1(u), \ldots, p_d(u)$.

During the computation on the input $u$, player $A$ sends node $v_i$ with probability
$p_i(u)$ to player $B$. Player $B$, on obtaining message $v_i$ from $A$, starts its
computation (simulation of the branching program $P$) from the node $v_i$ on the
part $w$ of the input $\sigma$.

Since every message is an internal node of $P$, the protocol uses at most size($P$)
distinct messages. The statement of the lemma follows from the definition of the protocol $\phi$. □

We use the lower bound for probabilistic one-way complexity from [1] in the
proof of Theorem 6 below. We recall the notation and the statement we need from
[1] in a form convenient for us.

For $U = L \times R$, with a boolean function $h$ we associate an $|L| \times |R|$ communication
matrix $CM$ whose $(u, w)$-th entry $CM[u, w]$ is $h(u, w)$ for all $(u, w) \in L \times R$.
As mentioned in [16], the one-way deterministic communication complexity
$DC^{\pi}_{U}(h)$ for the partition $\pi$ of inputs from $U$ of a boolean function $h$ is easily
seen to be $\lceil \log(\mathrm{nrow}(CM)) \rceil$, where $\mathrm{nrow}(CM)$ is the number of distinct rows
of the communication matrix $CM$ of the function $h$.

Consider w.l.o.g. the case when all rows of $CM$ are different, $\mathrm{nrow}(CM) = |L|$.

Choose a set $Y \subseteq R$ such that for any two distinct words $u, u' \in L$ there exists
a word $y \in Y$ such that $h(u, y) \neq h(u', y)$. The set $Y$ is called a control set for
the matrix $CM$.

Denote

$ts(CM) = \min\{|Y| : Y \text{ is a control set for } CM\}$.

It is evident that $\lceil \log \mathrm{nrow}(CM) \rceil \le ts(CM) \le \mathrm{nrow}(CM)$.

For a number $p \in [1/2, 1]$, define $pcc^{\pi}_{U}(h) = \frac{ts(CM)}{\log \mathrm{nrow}(CM)} H(p)$, where
$H(p) = -p \log p - (1-p) \log(1-p)$ is the Shannon entropy. Call $pcc^{\pi}_{U}(h)$ the $p$-probabilistic
communication characteristic of the function $h$.
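
A small numeric sketch of these quantities follows. It assumes that the logarithms are taken to base 2 and that the characteristic has the form $\frac{ts(CM)}{\log \mathrm{nrow}(CM)} H(p)$, a reading recovered from its later use in the proof of Theorem 6; the function names are hypothetical.

```python
import math


def shannon_entropy(p):
    """Binary Shannon entropy H(p) = -p log p - (1 - p) log(1 - p), base 2."""
    if p in (0.0, 1.0):
        return 0.0                      # limit value, avoids log(0)
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def pcc(ts, nrow, p):
    """Assumed form of the p-probabilistic communication characteristic:
    (ts / log2(nrow)) * H(p), for a matrix with nrow > 1 distinct rows."""
    return ts / math.log2(nrow) * shannon_entropy(p)
```

At $p = 1/2$ the entropy is 1, so the factor $1 - pcc$ in a bound of the form $DC \cdot (1 - pcc) - 1$ is nonpositive and the bound is trivial; the factor only becomes positive once $p$ is bounded away from 1/2 and $ts(CM)$ is close to $\log \mathrm{nrow}(CM)$.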

Theorem 4. [1] Let $\varepsilon \in [0, 1/2]$, $p = 1/2 + \varepsilon$. Let $U \subseteq \{0,1\}^n$ be such that
$U = L \times R$, where $L$ and $R$ are defined according to the partition $\pi$ of inputs of
a function $h : \{0,1\}^n \to \{0,1\}$. Then

$PC^{\pi}_{p,U}(h) \ge DC^{\pi}_{U}(h)\,(1 - pcc^{\pi}_{U}(h)) - 1$.

In the proof of Theorem 6 below we use the following result from number
theory (see [7] and [12] for additional citations).

For every natural number $n$ let $p(n)$ be the smallest prime greater than or equal
to $n$. Consider the field $Z_{p(n)}$ of the residue classes modulo $p(n)$.

Lemma 5. For every $n$ large enough, the following is true. If $A \subseteq Z_{p(n)}$ and
$|A| \ge 3\sqrt{n}$, then, for every $t \in Z_{p(n)}$, there is a subset $B \subseteq A$ such that the sum
of the elements of $B$ is equal to $t$.

Theorem 6. Let $\varepsilon \in [0, 1/2]$, $p = 1/2 + \varepsilon$. Then for arbitrary $\delta > 0$, for every $n$
large enough, it holds that any randomized ordered read-once branching program
that $(1-p, p)$-computes the function $g_n$ has size no less than

$\frac{1}{4}\left(\frac{2^{\,n - \lceil 3\sqrt{n} \rceil}}{n}\right)^{1 - (1+\delta)H(p)}$.
Proof. Let $P$ be a randomized ordered read-once branching program with an
ordering $\tau$ of variables which computes the function $g_n$. For the ordering $\tau = (i_1, i_2, \ldots, i_n)$
consider the partition $\pi$ of the variables $X$ of $g_n$ into two parts $X_1 = \{x_{i_1}, \ldots, x_{i_l}\}$
and $X_2 = \{x_{i_{l+1}}, \ldots, x_{i_n}\}$, where $l = n - \lceil 3\sqrt{n} \rceil$. Denote $t = \lceil 3\sqrt{n} \rceil$.

Describe  below  a  subset  U  C  {0,  !}”■  in  the  form  U  =  L  x  R  where  \L\  =  I, 

Denote by $I$ and $J$ the sets of indexes of variables from the sets $X_1$ and $X_2$ respectively.
For $s \in \{1, \ldots, n\}$ denote by $L_s$ the subset of binary sequences of length $l$ with
indexes from $I$ such that $L_s = \{u : \omega_n(\sum_{i \in I} i\,u_i) = s\}$. Denote by $L$ a maximum
among the sets $L_1, \ldots, L_n$:

$|L| = \max_{1 \le s \le n} \{|L_s|\}$.

Clearly,

$|L| \ge \frac{2^{\,n - \lceil 3\sqrt{n} \rceil}}{n}$.

Let $L = L_s$. Then denote $R = \{w : \omega_n(\sum_{j \in J} j\,w_j + s) = i,\ i \in I\}$. From the
definition of $R$ we have the following properties:

1) $|R| = l$;

2) for arbitrary distinct $u$ and $u'$ from $L$ there exists $w \in R$ such that $g_n(u, w) \neq g_n(u', w)$.

We will prove the second property (the first one is evident). Let $i \in I$ be
an index such that the $i$-th bits in the sequences $u$ and $u'$ are different, $u_i \neq u'_i$. From
Lemma 5 it follows that for every $n$ large enough, for our number $s$ and the
number $i$ there exists a sequence $w \in R$ such that $s + \sum_{j \in J} j\,w_j \equiv i \bmod p(n)$.

Then from the definition of $g_n$ it follows that $g_n(u, w) \neq g_n(u', w)$.

Now define the set $U$ as $U = L \times R$. From the above it follows that for the set $U$ the
$|L| \times |R|$ communication matrix $CM$ of $g_n$ has the following properties:

1) $\mathrm{nrow}(CM) = |L|$;

2) the set $R$ is a control set for $CM$.

This means that $DC^{\pi}_{U}(g_n) = \log |L|$ and that for the $p$-probabilistic communication
characteristic $pcc^{\pi}_{U}(g_n)$ of the function $g_n$ it is true that

$pcc^{\pi}_{U}(g_n) = (l / \log |L|)\,H(p) \le \big((n - \lceil 3\sqrt{n} \rceil) / (n - \lceil 3\sqrt{n} \rceil - \log n)\big)\,H(p)$.

From this it follows that for arbitrary $\delta > 0$, for every $n$ large enough, it holds
that

$pcc^{\pi}_{U}(g_n) \le (1 + \delta) H(p)$.

From the above property and Theorem 4 it follows that for every $n$ large
enough the following is true:

$PC^{\pi}_{p,U}(g_n) \ge (n - \lceil 3\sqrt{n} \rceil - \log n)(1 - (1+\delta)H(p)) - 1$.

From this and Lemma 3 the lower bound for size($P$) follows. □

Note that in the proof of Theorem 6 we use only the following consequence of
$P$ being ordered read-once: the set $X$ of variables of $P$
can be partitioned into two parts $X_1$ and $X_2$ such that $|X_1| = n - \lceil 3\sqrt{n} \rceil$ and
$|X_2| = \lceil 3\sqrt{n} \rceil$. The cardinality of $X_2$ is essential for the application of Lemma 5.
This means that the following statement is true.

Theorem 7. Let $\varepsilon \in [0, 1/2]$, $p = 1/2 + \varepsilon$. Let $P$ be a randomized $\pi$-weak-ordered
branching program that $(1-p, p)$-computes the function $g_n$. Let $\pi$ be a partition of
$X$ into two parts $X_1$, $X_2$ such that $|X_2| = t \ge \lceil 3\sqrt{n} \rceil$ and $|X_1| = l = n - t$.
Then for arbitrary $\delta > 0$, for every $n$ large enough, it holds that

$\mathrm{size}(P) \ge \frac{1}{4}\left(\frac{2^{\,l}}{n}\right)^{1 - (1+\delta)H(p)}$.

Theorem 8. There is a polynomial-size nondeterministic ordered read-once branching
program that computes the function $g_n$.

Proof. The proof is simple. For an arbitrary input $\sigma$, a nondeterministic ordered
read-once branching program $P$ that computes the function $g_n$ works as follows.
In the first (nondeterministic) phase $P$ nondeterministically selects a number
$s \in \{1, \ldots, n\}$. Then in the second (deterministic) phase $P$ reads the inputs in the
order $x_1, \ldots, x_n$. Along the computation path on the input $\sigma$, $P$ 1) computes the number
$a = \omega_n(\sum_{i=1}^{n} i\,\sigma_i)$ and 2) stores the $s$-th bit $\sigma_s$. If $a = s$ then $P$ outputs the bit $\sigma_s$ of the
input $\sigma$, else $P$ outputs 0. Clearly, $P$ has polynomial size. □
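
The guess-and-verify idea of this proof can be checked by brute force on small inputs. The sketch below is an illustration only: it models the nondeterministic choice of $s$ as a loop over all candidates, assumes that the counted number is $\omega_n(\sum_i i\,\sigma_i)$, and keeps the running sum reduced modulo $p[n]$; all function names are hypothetical.

```python
from itertools import product


def smallest_prime_at_least(n):
    """Smallest prime >= n (trial division, adequate for small n)."""
    candidate = max(n, 2)
    while any(candidate % d == 0 for d in range(2, int(candidate ** 0.5) + 1)):
        candidate += 1
    return candidate


def omega(n, s):
    """omega_n(s): residue of s mod p[n] in 1..p[n], or 1 if it exceeds n."""
    p = smallest_prime_at_least(n)
    j = s % p or p
    return j if j <= n else 1


def g_direct(sigma):
    """Direct evaluation: sigma_j with j = omega_n(sum of i * sigma_i)."""
    n = len(sigma)
    return sigma[omega(n, sum(i * b for i, b in enumerate(sigma, 1))) - 1]


def g_by_guessing(sigma):
    """Accept iff some guess s passes the single left-to-right check."""
    n = len(sigma)
    p = smallest_prime_at_least(n)
    for s in range(1, n + 1):                  # nondeterministic guess of s
        partial, stored = 0, 0
        for i, bit in enumerate(sigma, 1):     # read x_1, ..., x_n in order
            partial = (partial + i * bit) % p  # running sum modulo p[n]
            if i == s:
                stored = bit                   # remember the s-th bit
        if omega(n, partial) == s and stored == 1:
            return 1
    return 0


def agree_on_all_inputs(n):
    """Compare both evaluations on every input of length n."""
    return all(g_direct(list(x)) == g_by_guessing(list(x))
               for x in product((0, 1), repeat=n))
```

For a fixed guess, the information carried from one variable to the next is the pair (running sum modulo $p[n]$, stored bit), which takes at most $2\,p[n]$ values; this is the intuition behind the polynomial size claimed in the theorem.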
Abstract. In this paper we show how to construct efficient checkers
for programs that supposedly compute properties of polynomials. The
properties we consider are roots, norms, and other analytic/algebraic
functions of polynomials. In our model, both the program $\Pi$ and the
polynomial $p$ are available to the checker each as a black box. We show
how to check programs that compute a specific root (e.g., the largest) or
a subset of roots of the given polynomial.

The checkers, in addition to never computing the root(s) themselves,
strive to minimize both the running time (preferably $o(\deg^2 p)$) and the
number of black box evaluations of $p$ (preferably $o(\deg p)$). We obtain
deterministic checkers when a separation bound between the roots is known
and probabilistic checkers when the roots can be arbitrarily close. We
then extend the checkers to handle the situations when the program $\Pi$
returns an approximation to the root and when the evaluation of the
polynomial $p$ is approximate. Our results translate into efficient checkers
for matrix spectra computations both in the exact and approximate
settings, operating in the library model of [BLR93]. Next we show that
the usual characterization of norms using the triangle inequality is not
suited for self-testing in the exact case, but surprisingly, could be used
in the approximate case.

Our  results  are  complementary  to  most  of  the  existing  results  on  test¬ 
ing  polynomials.  The  testers  in  the  latter  have  the  goal  of  determining 
whether  a  program  computes  a  polynomial  of  given  degree,  whereas  we 
are  interested  in  checking  the  properties  of  a  given  polynomial. 


1  Introduction 

The paradigm of program checking and its extensions, self-testing and
self-correcting, have received considerable attention (e.g., [Blu88, BK89, BLR93,
Lip91, GLR+91, RS96, ABC+93, GGR96, EKR96]). The results in this field
have practical value as tools for efficient verification of the correctness of
programs. Furthermore, they have been applied to develop efficient probabilistically
checkable proofs [ALM+92].

* This work was done while the first, second, and fourth authors were visiting Sandia
National Labs. The second and fourth authors are also supported by the NSF Career
Award CCR-9624552, the Alfred P. Sloan Research Award, and the NSF grant DMI-91157199.

In  this  paper  we  investigate  the  problem  of  checking  and  testing  (both  in 
the  exact  and  approximate  cases)  programs  that  compute  properties  (i.e.,  func¬ 
tions  or  relations)  of  polynomials.  The  properties  we  consider  include  the  set 
of  all  roots,  the  largest  root,  the  smallest  root,  norms,  multiplication,  differen¬ 
tiation,  resultants,  etc.  Our  checkers  for  root-finding  problems  only  assume  an 
oracle  access  to  the  polynomial  p.  Note  that  this  is  a  weaker  requirement  than 
the  availability  of  an  explicit  representation  of  p.  This  model  lets  us  view  the 
checkers  for  matrix  spectra  computations  in  the  library  setting  of  [BLR93].  In 
this  framework,  checkers  call  already  tested  programs  in  the  library,  counting 
each  call  as  a  unit  time  call.  Such  calls  naturally  correspond  to  the  evaluation 
of  the  polynomial  in  our  model.  Consequently,  it  is  imperative  that  the  number 
of  evaluations  of  p  be  minimized. 

Our  approach  is  complementary  to  previous  work  on  checking  and  testing 
polynomials.  The  main  difference  is  the  following.  Most  of  the  existing  results  are 
concerned  with  checking/testing  programs  purportedly  evaluating  polynomials. 
In  this  paper  we  are  interested  in  checking  programs  that  take  a  polynomial  as 
an  input  and  compute  its  properties. 

Our  Results.  We  describe  efficient  checkers  for  programs  that  compute  one, 
few,  all,  or  specific  roots  (e.g.,  the  largest)  of  a  polynomial  p.  We  address  at 
length  the  checking  of  programs  computing  the  largest  root.  For  this  problem,  we 
construct  some  checkers  that  run  in  time  o(deg^  p)  and  make  only  o(degp)  calls  to 
p  (thus  ruling  out  an  explicit  interpolation)  using  powerful  tools  from  analysis. 
This  translates  into  more  efficient  checkers  than  ones  offered  by  several  other 
methods  that  use  explicit  interpolation.  We  obtain  deterministic  checkers  when  a 
separation  bound  between  the  roots  is  known  and  probabilistic  checkers  when  the 
roots  can  be  arbitrarily  close  (Section  3).  We  also  consider  the  situations  where 
(i) the program $\Pi(p)$ is computing an approximation to the root(s) of polynomial
p  and  (ii)  the  oracle  returns  an  approximate  evaluation  of  the  polynomial  p 
(Section  4).  In  these  cases,  we  provide  checkers  for  some  of  the  problems. 

Next  we  consider  programs  that  claim  to  compute  some  (unspecified)  norm 
on  the  domain  (Section  5).  There  are  several  norms  for  polynomials  (see  [Z93]); 
the  goal  is  to  test  whether  there  exists  a  norm  that  agrees  with  the  program  on 
most  inputs.  We  show  that  the  standard  characterization  of  norms  (using  trian¬ 
gle  inequality)  cannot  be  used  to  construct  exact  testers.  I.e.,  there  are  extremely 
“bad”  programs  (those  that  do  not  agree  with  any  one  norm  for  any  non-trivial 
fraction  of  the  inputs)  that  still  pass  the  test.  The  same  test,  however,  can  be 
used  to  verify  that  the  program  approximates  some  norm  for  a  non-trivial  frac¬ 
tion  of  the  domain.  Our  result,  which  applies  to  norms  defined  on  any  domain,  is 
intriguing  because  most  of  the  current  techniques  for  testing  use  an  exact  char¬ 
acterization  to  build  an  exact  tester  and  an  approximate  characterization  (where 
the  equalities  are  relaxed  to  approximations,  see  [ABC+93,  EKR96]  for  further 
exposition) to build an approximate tester. The exact characterization for norms
is too lenient to lead to an exact tester; surprisingly, however, it is strong enough
for an approximate one (even without resorting to an approximate characterization).
Additionally, this is the first instance where an unbounded inequality
(i.e., an inequality of the form $|h(\cdot)| > 0$, where $h$ is an expression) has been
addressed in testing.

The  nature  of  these  properties  entails  the  use  of  techniques  from  several 
disciplines  (like  numerical  and  complex  analysis,  geometry,  and  in  particular, 
geometry  of  polynomials)  that  are  new  to  checking. 

Applications.  We  show  how  to  check  programs  that  perform  matrix  spectra 
computations,  which  are  fundamental  in  scientific  computing  (Section  6).  We 
exploit  the  fact  that  the  eigenvalues  of  a  matrix  are  the  roots  of  its  characteristic 
polynomial.  The  characteristic  polynomial  is  evaluated  using  a  library  program 
for  the  determinant  that  has  been  tested,  for  instance  using  the  exact  checker 
of [Kan90] or the approximate checker of [ABC+93]. Several vital parameters
in  control  theory  (e.g.,  stability  of  a  system)  are  related  to  the  location  of  the 
roots  of  certain  polynomials.  Programs  that  compute  these  parameters  are  very 
common  in  practice  [BCL82];  our  checkers  could  be  used  to  check  such  programs. 
Another  application  of  property  testing  of  polynomials  is  in  verifying  parts  of 
computational  algebra  systems.  We  have  taken  an  initial  step  in  this  direction 
but  many  interesting  questions  remain. 

Previous  and  Related  Work.  The  problem  of  testing  root-finding  programs  is 
considered  as  early  as  1975  in  [JT75].  Here,  the  authors  lay  down  some  concrete 
requirements  for  an  efficient  testing  of  such  programs.  The  setting  proposed, 
however,  is  very  different  from  ours  and  is  mostly  heuristic  and  informal. 

A number of papers deal with testing whether a program is computing a low-degree
polynomial in the exact [GLR+91, AS92, RS96] and approximate
[EKR96] settings. Testing certain polynomial functions like polynomial
multiplication and FFT is investigated in [BLR93, Frg95]. Checkers for several
linear algebra computations like matrix rank, determinant, matrix multiplication
are given in [Fre79, BK89, Kan90, BLR93]. Approximate testers for several
linear algebra computations can be found in [ABC+93]. Testing graph properties
is considered in [GGR96].

2  Preliminaries 

Our Model. We consider properties $f$ of polynomials $p$. In this context, we
assume that properties are relations such as those binding $p$ to one or more of
its roots. For shorthand, we sometimes use “$f(p)$” to denote one of the values
to which $f$ binds $p$.

Although checkers are defined for properties and are otherwise independent
of the programs that they check, we sometimes refer to a checker for a program
$\Pi$. Implicit in these references is that the checker is for the property $f$
that $\Pi$ purports to compute, i.e., that the checker verifies that
$\Pi(x) \in f(x)$ for the input $x$ in question.

Definition 1. Let $\Pi$ be a program that purports to compute a property $f$. Let
$\beta > 0$ be a security parameter. Then, a $(q(n), t(n); \epsilon_1, \epsilon_2)$-checker
for $f$ is a (probabilistic) oracle program that has oracle access to both $\Pi$
and $p$ such that it

1. makes $O(q(n))$ oracle accesses to $p$ (i.e., it evaluates $p$ at $O(q(n))$ points)

2. runs in time $O(t(n))$, counting oracle calls as one unit of time

3. if $\exists y \in f(p) : |\Pi(p) - y| \le \epsilon_1$, outputs “PASS” with probability $\ge 1 - \beta$

4. if $\forall y \in f(p) : |\Pi(p) - y| > \epsilon_2$, outputs “FAIL” with probability $\ge 1 - \beta$.

To simplify notation, we adopt the following conventions: (i) if $q(n) = t(n)$, we
omit one of them, (ii) if $\epsilon_1 = \epsilon_2$, we omit one of them, and (iii) if
$\epsilon_1 = \epsilon_2 = 0$, we omit both from the checker’s parameters.

Note that the above model is more general than the standard checking model
in that $p$ is available as an oracle rather than in an explicit form. (It is often
unrealistic or less efficient to assume that an explicit representation of $p$ is
available to the checker $T$.) We will see that this model (i) captures the library
setting of [BLR93] and helps us build efficient checkers, (ii) is useful in our
applications to checking matrix spectra computations, and (iii) elegantly extends
our checkers to the approximate setting.

Variations of the Model. Our model permits the following variations and
their combinations: (i) The program purports to return an approximation to $f$.
In this case, the program is denoted by $\tilde\Pi$. (ii) Each oracle call to
evaluate $p$ returns an approximation. In this case, the oracle is denoted by
$\tilde p$. (iii) $p$ is “close” to a polynomial (as in the PCP setting). We will
address the first and second variations. They make the problem more appealing
since in practice we are seldom guaranteed an exact answer to any numerical
question. In this paper, we will call a checker for the second scenario an
approximate checker.

Self-Testing, Self-Correcting, Checking, and Libraries. Self-testing
ensures that $\Pi$ equals the target function $f$ (from a function family $F$) on
most inputs. A self-tester usually has two stages [BLR93]: (i) testing if $\Pi$ is
a member of $F$ (the property test) and (ii) testing if $\Pi$ is the specific
member, i.e., $f$ (the equality test). Self-correction involves taking a $\Pi$ that
is correct on most inputs and converting it into a program that is correct on all
inputs. A self-tester together with a self-corrector gives a result-checker. In
the library setting, a collection of previously checked programs is used to build
checkers for new functions. For details see [BLR93].

Mathematical Notation. We consider polynomials over a field $\mathcal{F}$. Let
$\mathbb{R}$ denote the real numbers and $\mathbb{C}$ denote the complex numbers.

Let $\mathcal{F}_n[x]$ denote the ring of polynomials of degree $\le n$ with
coefficients from $\mathcal{F}$. Let $p(\cdot)$ be a degree $n$ polynomial (i.e.,
$p \in \mathcal{F}_n[x]$). Assuming $p$ factors completely in $\mathcal{F}$, let the
roots of $p$ be $|\lambda_1| \ge \cdots \ge |\lambda_n|$. When $\mathcal{F} = \mathbb{R}$,
we call $p$ a real polynomial and if all the roots of $p$ are real, we call $p$ a
real-root polynomial.

For any $a \in \mathbb{C}$, let $\bar{a} \in \mathbb{C}$ denote its complex conjugate.
For any curve (line segment, interval) $C$, let $|C|$ denote its length and
$\operatorname{int} C$ its interior. For $x, y \in \mathbb{R}^2$, let $\overline{xy}$
denote the line segment between $x$ and $y$. A convex curve in $\mathbb{R}^2$ is
called a contour if it encloses the origin. A curve $C : [0, 2\pi) \to \mathbb{R}^2$
is called star-shaped if it is an injective closed curve.

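Let $p(x) = \prod_{i=1}^{n} (x - \lambda_i)$. We use the following notation.

$g(x)$: We will use $g(x)$ to denote
$p'(x)/p(x) = \sum_{i=1}^{n} 1/(x - \lambda_i) = d \ln |p(x)| / dx$.

$\lambda_{\inf}, \lambda_{\sup}$: Cauchy’s inequality [BCL82] gives bounds on the
roots of $p$ as
$\lambda_{\inf} = |a_n| / (|a_n| + \max_{i=0}^{n-1} \{|a_i|\}) \le |\lambda_{\min}| \le |\lambda_{\max}| \le 1 + \max_{i=1}^{n} \{|a_i|\} / |a_0| = \lambda_{\sup}$,
where $a_0$ is the leading coefficient of $p$ and $a_n$ is its constant term.

$\delta$: A separation bound between the roots of $p$ is given by [BCL82]: the
quantity $\delta = \min_{i \ne j} |\lambda_i - \lambda_j|$ is bounded below by an
expression in $\|p\|$ and $\operatorname{disc}(p)$, where the discriminant
$\operatorname{disc}(p) = |\prod_{i \ne j} (\lambda_i - \lambda_j)| = |\operatorname{res}(p, p')|$
and $\|p\|$ is a norm of $p$ [P93]. Here, the resultant
$\operatorname{res}(p, q) = \prod_{i} q(\lambda_i)$ where $\lambda_i$ is a root of $p$.
Some of our checkers assume that a lower bound on $\delta$ is known.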
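
As an illustration of the notation above, the following minimal sketch computes the Cauchy bounds and the logarithmic derivative $g$. It assumes the coefficient list is ordered from the leading coefficient $a_0$ down to the constant term $a_n$, with both nonzero; the function names are hypothetical and not taken from the paper.

```python
from typing import Sequence, Tuple


def cauchy_root_bounds(coeffs: Sequence[complex]) -> Tuple[float, float]:
    """Return (lambda_inf, lambda_sup) for p with coeffs [a_0, ..., a_n].

    Assumption: a_0 is the leading coefficient, a_n the constant term,
    and both are nonzero.
    """
    a0, an = abs(coeffs[0]), abs(coeffs[-1])
    # Upper bound: 1 + max_{i=1..n} |a_i| / |a_0|
    lam_sup = 1 + max(abs(a) for a in coeffs[1:]) / a0
    # Lower bound: |a_n| / (|a_n| + max_{i=0..n-1} |a_i|)
    lam_inf = an / (an + max(abs(a) for a in coeffs[:-1]))
    return lam_inf, lam_sup


def log_derivative(roots: Sequence[complex], x: complex) -> complex:
    """g(x) = p'(x)/p(x) = sum_i 1/(x - lambda_i), given the roots of p."""
    return sum(1 / (x - lam) for lam in roots)


# p(x) = (x - 1)(x - 2)(x - 4) = x^3 - 7x^2 + 14x - 8
lam_inf, lam_sup = cauchy_root_bounds([1, -7, 14, -8])
```

For this example the bounds bracket the true extreme root magnitudes 1 and 4, although loosely, which is typical of coefficient-based bounds.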

Problem Definitions. Let $\Pi$ be a program that purports to compute one or
more roots of $p$ and let $\{\mu_i\}$ be the value(s) computed by $\Pi$. Let
$\{\lambda_i\}$ be the actual root(s) that $\Pi$ should have output. (Thus for
instance, $\Pi_{\max}$, which purports to compute $\lambda_{\max}$, outputs
$\mu_{\max}$ to be the largest root.) Given a polynomial $p$ of degree $n$, let:

- $\mathcal{R}_1(p)$ be a relation mapping $p$ to any one of its roots. We refer to
programs that purport to compute a value $\mathcal{R}_1(p)$ as $\Pi_1$.

- $\mathcal{R}_r(p)$ be a relation mapping $p$ to any $r$ of its roots. We refer to
programs that purport to compute a set $\mathcal{R}_r(p)$ as $\Pi_r$.

- $\mathcal{R}_{(k)}(p)$ be the $k$th largest root of $p$ in absolute value (i.e.,
$\lambda_k$). $\Pi_{(k)}$ refers to a program that purports to compute
$\mathcal{R}_{(k)}$.

- $\mathcal{R}_{\max} = \mathcal{R}_{(1)}$, $\mathcal{R}_{\min} = \mathcal{R}_{(n)}$, and
$\Pi_{\max}$, $\Pi_{\min}$ refer to programs that supposedly compute
$\mathcal{R}_{\max}$, $\mathcal{R}_{\min}$.

In general, we use a tilde to denote programs that purport to return
approximations to the corresponding exact relation (e.g., $\tilde\Pi_{\max}$).

3  Checking  Roots:  Exact  Setting 

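Checkers for $\mathcal{R}_1, \mathcal{R}_r, \mathcal{R}_n, \mathcal{R}_{\max}$.

Theorem 2. Let $|\mathcal{F}| \ge n + \Omega(n)$. There is:

1. a $(1)$-checker for $\mathcal{R}_1(p)$,

2. a $(1, n)$-checker for $\mathcal{R}_n(p)$, and

3. a $(\min\{r, n - r\}, \max\{r, n - r\})$-checker for $\mathcal{R}_r(p)$.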

In the exact setting, given $\mu_{\max}$, it is trivial to verify that it is a
root of $p$. It is non-trivial to verify the maximality claim. Theorem 3 below
states a checker for $\Pi_{\max}(p)$. In the next section, we will show more
efficient checkers (that avoid explicit interpolation) for $\Pi_{\max}(p)$.

Theorem 3. $\forall \epsilon > 0$, there is an $(n, n^{\ldots}; \epsilon)$-checker for
$\mathcal{R}_{\min}(p)$. If $p$ is a real-root polynomial, $\forall \epsilon > 0$,
there is an $(n, n^{\ldots}; \epsilon)$-checker for $\mathcal{R}_{\max}(p)$.

A checker for $\Pi_{\max}$ is constructed from a checker for $\Pi_{\min}$ in an
obvious manner by observing that $1/\lambda_i$ are the roots of $x^n p(1/x)$. Note
that the checkers given by Theorem 3 can also be used to check $\Pi_{(k)}$.
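
The observation about $x^n p(1/x)$ can be made concrete with a small sketch. It assumes $p$ is given explicitly by its coefficients (leading coefficient first) and that $p(0) \ne 0$, so that every root has a reciprocal; the helper names are illustrative only.

```python
def reverse_coeffs(coeffs):
    """Coefficients of x^n * p(1/x), given p's coefficients leading-first.

    Reversing the coefficient list maps each nonzero root lambda of p
    to the root 1/lambda of the reversed polynomial.
    """
    return list(reversed(coeffs))


def evaluate(coeffs, x):
    """Horner evaluation of a polynomial, leading coefficient first."""
    acc = 0
    for a in coeffs:
        acc = acc * x + a
    return acc


# p(x) = (x - 1)(x - 2)(x - 4); its reversal has roots 1, 1/2, 1/4.
p = [1, -7, 14, -8]
q = reverse_coeffs(p)
```

Since the smallest root of $p$ in absolute value becomes the largest root of the reversed polynomial (and vice versa), a checker for one extreme can be applied to the reversal to check the other.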

Improved Checkers for $\mathcal{R}_{\max}$: $\delta$ known. For the rest of this
section, we will take either $\mathcal{F} = \mathbb{C}$ or $\mathcal{F} = \mathbb{R}$.
We use the following theorems from complex analysis (see [Con78]). Let $n(C; z)$
be the number of times $C$ “winds” around the point $z \in \mathbb{C}$.

Theorem 4 (Cauchy’s Residue Theorem). Let $G$ be an open subset of the
plane and $f : G \to \mathbb{C}$ an analytic function. If $C$ is a closed rectifiable
(finite length) curve in $G$ such that $n(C; z) = 0$ $\forall z \in \mathbb{C} \setminus G$,
then for $\lambda \in G \setminus C$,
$2\pi i \, f(\lambda) \, n(C; \lambda) = \int_C f(z)/(z - \lambda) \, dz$.

Corollary 5. Let $G$ be an open subset of the plane and $f$ be an analytic
function on $G$ with zeros $\lambda_1, \ldots, \lambda_n$ (repeated according to
multiplicity). If $C$ is a closed rectifiable curve in $G$ which does not pass
through any point $\lambda_k$, and $n(C; z) = 0$, $\forall z \in \mathbb{C} \setminus G$,
then $\frac{1}{2\pi i} \int_C f'(z)/f(z) \, dz = \sum_{k=1}^{n} n(C; \lambda_k)$ counts
(with multiplicities) the number of roots of $f(z)$ within $C$.

Theorem 6. There is a $((|\lambda_{\max}|/\delta)^{3/2}; \delta/2)$-checker for
$\mathcal{R}_{\max}(p)$.


Proof. If $C$ is a circle, then by Corollary 5, $\int_C p'(z)/p(z) \, dz$ computes
$2\pi i$ times the number of zeros of $p$ that are within $C$ (noting that $C$
winds once around each root). So, our goal is to check that
$\int_C p'(z)/p(z) \, dz = 2\pi i n$, where $C(t) = (|\mu_{\max}| + \delta/2) e^{it}$.
Recall that $g(z) = p'(z)/p(z)$. We compute an approximation $S$ to
$\int_C g(z) \, dz$, which must satisfy $|S - 2\pi i n| < \pi$. If we use the
trapezoidal rule, we have

$|\int_C g(z) \, dz - \sum_i a_i g(z_i)| \le (|C|^3/N^2) \max_{z \in C} |g''(z)|$,

where the $a_i$’s are constants and $N$ is the number of points of evaluation.

Since we can only approximate $g(z_i)$, we actually end up computing
$\sum_i a_i (g(z_i) + \epsilon_i)$. Therefore, we can evaluate the overall error as

$|\int_C g(z) \, dz - \sum_i a_i (g(z_i) + \epsilon_i)| \le (|C|^3/N^2) \max_z |g''(z)| + c_1 \epsilon |\mu_{\max}|$,

where $c_1$ is a small constant and $\epsilon = \max_i |\epsilon_i|$. Our goal is to
find conditions under which
$(|C|^3/N^2) \max_z |g''(z)| + c_1 \epsilon |\mu_{\max}| < \pi$, such that rounding
always gives the correct value. We first find a bound on $\epsilon$ for which
$c_1 \epsilon |\mu_{\max}| < \pi/2$. If $p'$ is approximated by finite differences,
then $\epsilon = \max_\zeta |(p(\zeta + \Delta) - p(\zeta))/(\Delta p(\zeta)) - g(\zeta)| \le \Delta |p''(\zeta')/p(\zeta)|$,
for some $\zeta' \in (\zeta, \zeta + \Delta)$. An upper bound on $\Delta$ is dictated
by these conditions. Now, we have
$|p''(\zeta)/p(\zeta)| = |\sum_{j \ne k} 1/((\zeta - \lambda_j)(\zeta - \lambda_k))| = O(n^2/\delta^2)$,
since every root is at distance at least $\delta/2$ from $C$. Thus, $\Delta$ must
satisfy $\Delta \le (c_2 \delta^2)/(n^2 |\mu_{\max}|)$, where $c_2$ is a small constant.

The other error term $(|C|^3/N^2) \max_z |g''(z)|$ can now be upper bounded as
$(|C|^3/N^2) \max_z |g''(z)| \le (c_3 |\mu_{\max}|^3)/(N^2 \delta^3)$, from which the
number of evaluation points is $N = O((|\mu_{\max}|/\delta)^{3/2})$.

For real-root polynomials, the number of oracle calls to $p$ can be reduced by
stronger bounds on $\max_z |g''(z)|$. The proof is omitted.
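
The idea in the proof of Theorem 6 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact procedure: the oracle `p` is an ordinary Python callable, $p'$ is approximated by a forward difference with a hypothetical step size, the number of sample points is fixed rather than derived from $|\mu_{\max}|/\delta$, and the checker itself uses floating-point arithmetic.

```python
import cmath
import math


def count_roots_in_disk(p, radius, num_points, step=1e-6):
    """Approximate (1/(2*pi*i)) * integral of p'(z)/p(z) over |z| = radius.

    Uses equally spaced points on the circle (trapezoidal rule on a closed
    curve) and a forward difference for p', so only oracle calls to p occur.
    """
    total = 0j
    dt = 2 * math.pi / num_points
    for k in range(num_points):
        z = radius * cmath.exp(1j * k * dt)   # point C(t_k) on the circle
        dp = (p(z + step) - p(z)) / step      # finite-difference p'(z)
        total += (dp / p(z)) * (1j * z * dt)  # g(z) * C'(t) * dt
    return total / (2j * math.pi)


def check_max_root(p, degree, mu_max, delta, num_points=256, tol=1e-9):
    """PASS (True) iff mu_max is a root and all roots lie within |mu_max| + delta/2."""
    if abs(p(mu_max)) > tol:
        return False
    inside = count_roots_in_disk(p, abs(mu_max) + delta / 2, num_points)
    return round(inside.real) == degree


def example_p(z):
    """Example oracle: p(z) = (z - 1)(z - 2)(z - 4)."""
    return (z - 1) * (z - 2) * (z - 4)
```

With this example oracle, `check_max_root(example_p, 3, 4.0, 1.0)` accepts, while the claim that 2 is the largest root is rejected because only two of the three roots lie inside the circle of radius 2.5.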

Corollary 7. If $p$ is a real-root polynomial, then there exists a
$(\sqrt{n} \, |\mu_{\max}| + (1/\delta)^{\ldots}; \delta/2)$-checker for
$\mathcal{R}_{\max}(p)$.

If $\delta = O(1/n)$, then the above corollary yields asymptotically better
checkers than those given by Theorems 3 and 6. We also give a different checker
(proof omitted) that can be extended to work in the approximate setting.

Theorem 8. There is a $(\ldots/2)$-checker for $\mathcal{R}_{\max}(p)$ where
$\epsilon < 1/|p''(x)|$, $\forall x$.

Improved Checkers for $\mathcal{R}_{\max}$: $\delta$ unknown. We obtain the
following checkers for the case when $\delta$ is not known. The proofs of the
theorems are omitted.

Theorem 9. Let $p$ be a real-root polynomial. $\forall \epsilon, \beta > 0$, there
is a $(\sqrt{n} \log^{\ldots}(n/\epsilon) \log(1/\beta); \epsilon/2, \epsilon)$-checker
for $\mathcal{R}_{\max}(p)$ that is correct with probability $\ge 1 - \beta$.

The  above  theorem  can  be  extended  to  the  case  when  the  roots  are  complex.  The 
checker  is  still  attractive  in  terms  of  its  running  time,  but  has  more  evaluations 
of  p. 

Corollary 10. $\forall \epsilon, \beta > 0$, there is an
$((\ldots/\epsilon) \log(1/\beta); \epsilon/2, \epsilon)$-checker for
$\mathcal{R}_{\max}(p)$ that is correct with probability $\ge 1 - \beta$.

The  checkers  in  this  section  can  be  extended  to  check 

4  Checking  Roots:  Approximate  Setting 

So far, we have been using the assumption that the programs being checked
should return the exact root(s) and the oracle returns the exact value. As we
stated in the description of our model, we have two variants: $\tilde\Pi$ and
$\tilde p$. The former turns out to be easier than the latter.

Case I: $\tilde\Pi$. Suppose $\tilde\Pi_1(p)$ returns an $\epsilon$-approximation
(i.e., it claims $|\mu_1 - \lambda_1| \le \epsilon < \delta$). When $p$ is real-root,
with two oracle calls to $p$ we can check if there is a sign-change in
$[\mu_1 - \epsilon, \mu_1 + \epsilon]$. For $\tilde\Pi_n(p)$ (resp. $\tilde\Pi_r(p)$), we
can extend the above checker with $2n$ (resp. $2r$) calls to $p$. Since we do not
have a nice analog of Rolle’s theorem in complex analysis, the problem becomes
harder when $p$ is not real-root.
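
The sign-change test for real-root polynomials can be sketched in a few lines. The sketch assumes an exact real-valued oracle `p` and treats an endpoint that is itself a root as a pass; both function names are hypothetical.

```python
def has_sign_change(p, mu, eps):
    """Two oracle calls: does p change sign on [mu - eps, mu + eps]?"""
    left, right = p(mu - eps), p(mu + eps)
    if left == 0 or right == 0:
        return True  # an endpoint is itself a root
    return (left < 0) != (right < 0)


def check_approximate_roots(p, mus, eps):
    """Extend the test to r claimed roots using 2r oracle calls."""
    return all(has_sign_change(p, mu, eps) for mu in mus)


def example_p(x):
    """Example real-root oracle: p(x) = (x - 1)(x - 2)(x - 4)."""
    return (x - 1) * (x - 2) * (x - 4)
```

A sign change certifies an odd number of roots in the interval, which is why the real-root assumption matters: for complex roots there is no comparable one-dimensional test.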

All our checkers for $\mathcal{R}_{\max}$ in Section 3 can be extended to this
approximate setting. This can be done as follows: (i) first we check if
$\mu_{\max}$ is an approximate root and (ii) then we check if it is indeed the
maximum root. The former is accomplished by checking if there is a root inside a
small circle around $\mu_{\max}$ (see previous paragraph) and the latter is
accomplished by selecting two curves separated by $\epsilon$ and then performing
the numerical integration twice. Thus, we obtain the following theorem:

Theorem 11. $\forall \epsilon, \beta > 0$, there is (i) a
$((|\mu_{\max}|/\delta)^{3/2}; \delta/2)$-checker for $\tilde{\mathcal{R}}_{\max}(p)$,
and (ii) an $((\ldots/\epsilon) \log(1/\beta); \epsilon/2, \epsilon)$-checker for
$\tilde{\mathcal{R}}_{\max}(p)$ that is correct with probability $\ge 1 - \beta$. If
$p$ is a real-root polynomial, $\forall \epsilon, \beta > 0$, there is (i) a
$(\sqrt{n} \, |\mu_{\max}| + (1/\delta)^{\ldots}; \delta/2)$-checker for
$\tilde{\mathcal{R}}_{\max}(p)$, and (ii) a
$(\sqrt{n} \log^{\ldots}(n/\epsilon) \log(1/\beta); \epsilon/2, \epsilon)$-checker for
$\tilde{\mathcal{R}}_{\max}(p)$ that is correct with probability $\ge 1 - \beta$.

Case II: $\tilde p$. Theorem 8 can be extended to the case when we have only
$\tilde p$ (i.e., the evaluation of $p$ is approximate). The proof is omitted.

Corollary 12. There is a $(\lambda_{\sup} - \ldots)$-checker for
$\mathcal{R}_1(p)$.

The  checkers  in  Section  3  are  not  directly  usable  in  this  case  because  of  the 
instability  of  numerical  derivative  computation  in  the  presence  of  errors. 

5  Testing  Norms 

A function $f : V(\mathcal{F}) \to \mathbb{R}$, where $V$ is a vector space over
$\mathcal{F}$, is called a norm if it satisfies: (i) $f(x) = 0 \iff x = 0$,
(ii) $\forall x \in V, k \in \mathcal{F}$, $f(kx) = k f(x)$ (scalability), and
(iii) $\forall x, y \in V$, $f(x + y) \le f(x) + f(y)$ (triangle inequality).

In this section we investigate the problem of checking whether the function
computed by a program $\Pi_{norm}$ is close to a norm (i.e., there is a norm that
agrees with $\Pi_{norm}$ on most inputs). In the specific case of vector $p$-norms
on $\mathbb{R}^n$, which are of the form $|x|_p = (\sum_{i=1}^{n} |x_i|^p)^{1/p}$, the
problem reduces to the well-studied problem of multivariate degree-testing
[AS92, RS96]. In fact, matrix spectral norms can be checked using our techniques
in Section 3 and Section 4. In the more general case of checking whether the
function is close to any norm, we show that the properties characterizing a norm
are not usable for exact self-testing. This result is already interesting in that
our tests are almost exactly the same as the standard linearity test except for
an inequality in the second test. This, however, makes a big difference in the
validity of the test, which leads us to believe that inequalities in general do
not lead to (exact) self-testers. In a striking contrast, we show that these
properties characterizing norms can lead to approximate self-testers. The
following discussion is for $\mathbb{R}^2$ and can be extended to $\mathbb{R}^n$.

Exact Testing. To check scalability of $\Pi_{norm}$, note that along a vector $x$,
scalability defines the same set of functions as linearity. Checking
$\Pi_{norm}(ax) + \Pi_{norm}(bx) = \Pi_{norm}((a + b)x)$ for $x$, $|x| = 1$ will
determine if $\Pi_{norm}$ is scalable along $x$ (this is the linearity test of
[BLR93]). By performing this test at many $x$, we can ensure that $\Pi_{norm}$ is
scalable for many $x$. Therefore, for the rest of this discussion, we can assume
that $\Pi_{norm}$ is scalable. $\forall i \in \mathbb{R}$, define the “concentric”
contours $C_i = \{x \mid \Pi_{norm}(x) = i\}$. We first show that checking the
triangle inequality is equivalent to checking the convexity of $C_i$ in
$\mathbb{R}^2$ for any $i \in \mathbb{R}$.

Lemma 13. Let $f$ be a scalable function, i.e., $f(kx) = k f(x)$. Then,
$\exists a, b \in V$ such that $f(a + b) > f(a) + f(b)$ $\iff$ $\forall i$, the $i$-th
contour $C_i$ is not convex (the non-convexity occurs along $a + b$).

We  show  in  Theorem  14  that  random  sampling  of  condition  3  does  not  work: 
there  are  extremely  “bad”  programs  that  pass  it. 

Theorem 14. $\forall \, 0 < \delta < 1$, there exists a scalable $\Pi_{norm}$ that
is at least $\delta$ away from the nearest convex function $g$, i.e.,
$\Pr_x[\Pi_{norm}(x) \ne g(x)] \ge \delta$, but $\Pi_{norm}$ passes the test for
condition 3 with arbitrarily high probability.

Approximate Testing. In contrast to exact testing, we show that the properties
characterizing norms can be used to test if a program approximately computes a
norm at a non-trivial fraction of the inputs.

For a given star-shaped $C$, let the diameter be
$\operatorname{diam} C = \sup_{t_1, t_2} \{|C(t_1) - C(t_2)|\}$. Given two curves
$C_1, C_2$, let the distance between them be
$|C_1 - C_2| = \sup_t \{|C_1(t) - C_2(t)|\}$. For two contours $C_1, C_2$ and for any
other star-shaped $C$, let the deviation measure be
$\operatorname{dev}_{C_1, C_2}(C) = \Pr_t[C(t) > C_1(t), C_2(t) \text{ or } C(t) < C_1(t), C_2(t)]$.
This measures the fraction of $C$ not lying between $C_1$ and $C_2$. For a
star-shaped $C$, let
$\Delta = \Delta(C) = \Pr_{s, t \in C, \, z \in \overline{st}}[z \notin C \cup \operatorname{int} C]$.
In other words, $\Delta$ is the probability that, if we pick random $s, t \in C$ and
a random point $z$ on the line joining them, then $z$ lies outside $C$. Testing
condition 3 on random $x, y$, we can estimate $\Delta$ corresponding to the contour
defined by $\Pi_{norm}$ (assuming it is star-shaped, which is easy to check).
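
The two sampling tests discussed in this section (scalability along random directions, and condition 3 on random pairs) can be sketched as below. This is an illustration only: the Gaussian sampling distribution, the restriction to nonnegative scalars, the floating-point tolerance, and the two example functions are assumptions of the sketch, and the violation rate it reports is a proxy for $\Delta$ rather than the exact quantity defined above.

```python
import math
import random


def random_vector(dim, rng):
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def scalability_violation_rate(f, dim, trials, rng, tol=1e-9):
    """Fraction of trials where f(ax) + f(bx) != f((a + b)x) for unit x, a, b >= 0."""
    bad = 0
    for _ in range(trials):
        x = random_vector(dim, rng)
        length = math.sqrt(sum(c * c for c in x))
        x = [c / length for c in x]
        a, b = rng.uniform(0.0, 1.0), rng.uniform(0.0, 1.0)
        lhs = f([a * c for c in x]) + f([b * c for c in x])
        rhs = f([(a + b) * c for c in x])
        if abs(lhs - rhs) > tol:
            bad += 1
    return bad / trials


def triangle_violation_rate(f, dim, trials, rng, tol=1e-9):
    """Fraction of random pairs (x, y) with f(x + y) > f(x) + f(y)."""
    bad = 0
    for _ in range(trials):
        x, y = random_vector(dim, rng), random_vector(dim, rng)
        if f([xi + yi for xi, yi in zip(x, y)]) > f(x) + f(y) + tol:
            bad += 1
    return bad / trials


def euclidean(v):
    """A genuine norm."""
    return math.sqrt(sum(c * c for c in v))


def half_power(v):
    """Scalable for k >= 0 but not convex: (sum_i sqrt|v_i|)^2."""
    return sum(math.sqrt(abs(c)) for c in v) ** 2
```

Both example functions pass the scalability test, but only the Euclidean norm has a zero violation rate for the triangle inequality; `half_power` has non-convex contours and is caught by random pairs, for instance $x = (1, 0)$, $y = (0, 1)$ gives $f(x + y) = 4 > 2$.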

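Theorem 15. Given $\rho > 0$, $\exists \epsilon = \epsilon(\rho) < 1$,
$\gamma = \gamma(\epsilon) > 0$ such that for any star-shaped $C$ with
$\operatorname{diam} C \le 1$, if $\Delta(C) < \gamma$ then there is a contour $C'$
such that $\Pr_t[|C(t) - C'(t)| > \rho] < \epsilon$.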

6  Some  Applications:  Matrix  Computations 

In this section, we show applications of our checkers for polynomial roots to
matrix spectra computations. Let the eigenvalues of $A \in \mathcal{F}^{n \times n}$
be $\{\lambda_i \mid 1 \le i \le n\}$ with
$|\lambda_{\max} = \lambda_1| \ge \cdots \ge |\lambda_n|$. It is easy to find an upper
bound $\lambda_{\sup}$ on $\lambda_{\max}$ (e.g., set $\lambda_{\sup} = \|A\|_\infty$).
We denote by $\delta$ a separation bound between the eigenvalues of $A$. Let Det
be a correct program available in the library for computing the determinant of a
matrix. Det corresponds to the oracle $p$ in our model.
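
To make the role of Det concrete, the sketch below builds an evaluation oracle for the characteristic polynomial $p(x) = \det(xI - A)$ from a determinant routine. The exact rational Gaussian elimination used here is only a stand-in for the trusted library program; it and the function names are assumptions of this illustration.

```python
from fractions import Fraction


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result  # a row swap flips the sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result


def char_poly_oracle(a):
    """Return p with p(x) = det(x*I - A); each call to p is one call to det."""
    n = len(a)

    def p(x):
        return det([[(x if i == j else 0) - a[i][j] for j in range(n)]
                    for i in range(n)])

    return p


# A = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
p = char_poly_oracle([[2, 1], [1, 2]])
```

Any of the root checkers can then be run against `p` without ever forming the coefficients of the characteristic polynomial, which is the point of giving the checker oracle access rather than an explicit representation.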

Eigenvalues  in  the  Exact  Setting.  All  the  checkers  in  Section  3  and  Section 
4  translate  to  checkers  for  eigenvalues.  We  now  illustrate  more  efficient  checkers 
for  some  special  cases  which  are  of  interest  in  practice: 

Lemma 16. Let $A \in \mathbb{R}^{n \times n}$ with $A = A^T$. There is an
$(n)$-checker for a program computing the largest eigenvalue of $A$. If $A$ is
tridiagonal, there is an $(n)$-checker for a program computing the $k$-th largest
eigenvalue of $A$.

These checkers can be used to check programs designed for computing the second
largest eigenvalue of a regular graph (or the largest eigenvalue of its Laplacian),
which is related to the expansion of the graph. Another natural application of
these checkers is to check programs that decide whether a matrix is positive
definite.

Eigenvalues in the Approximate Setting. All of our approximate checkers
for roots can be used in this case. We consider an interesting special case of
this problem. Let $(\lambda, x)$ be an exact eigenvalue-eigenvector pair of $A$
with $\|x\| = 1$. Let $\tilde{\mathcal{R}}_1$ be a relation that binds matrix $A$ to
pairs $(\mu, \tilde x)$ with $|\mu - \lambda| \le \epsilon_1$ and
$\|\tilde x - x\| \le \epsilon_2$. Let $\tilde\Pi_1$ be a program that, on input $A$,
purports to compute a $(\mu, \tilde x) \in \tilde{\mathcal{R}}_1(A)$. The most natural
way of checking $\tilde\Pi_1$ would be by checking that
$\|A \tilde x - \mu \tilde x\| \le \epsilon$, for a certain threshold $\epsilon$, and
passing $\tilde\Pi_1$ if the above inequality is satisfied. Unfortunately, from
perturbation theory [GV89] we have that the value of $\epsilon$ above can be
small, but $|\mu - \lambda|$ can be as large as $\epsilon / |y^H x|$, where $y$ is a
unit length left eigenvector of $A$ ($y^H A = \lambda y^H$), assuming $\lambda$ to
be a simple eigenvalue. Thus, we might need to set $\epsilon$ to a very small
value if we want to make sure that $|\mu - \lambda| \le \epsilon_1$ for a reasonably
small value $\epsilon_1$. Note however that for normal matrices, $|y^H x| = 1$ so
that $\|A \tilde x - \mu \tilde x\| \le \epsilon \Rightarrow |\lambda - \mu| \le \epsilon$.
Thus, we have the following lemma:

Lemma 17. $\tilde{\mathcal{R}}_1(A)$ can be approximately checked when $A$ is normal.
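
The residual test behind Lemma 17 can be sketched as follows. The sketch assumes a real symmetric (hence normal) matrix given as nested lists, normalises the claimed eigenvector before testing, and uses hypothetical function names; for non-normal matrices a small residual does not bound $|\mu - \lambda|$, as explained above.

```python
import math


def residual_norm(a, mu, x):
    """Return ||A u - mu u|| where u = x / ||x||."""
    n = len(a)
    length = math.sqrt(sum(abs(c) ** 2 for c in x))
    u = [c / length for c in x]
    r = [sum(a[i][j] * u[j] for j in range(n)) - mu * u[i] for i in range(n)]
    return math.sqrt(sum(abs(c) ** 2 for c in r))


def check_eigenpair_normal(a, mu, x, eps):
    """PASS iff the residual is at most eps.

    For normal A this implies some eigenvalue lies within eps of mu.
    """
    return residual_norm(a, mu, x) <= eps


# A = [[2, 1], [1, 2]] is symmetric with eigenpair (3, (1, 1)).
A = [[2.0, 1.0], [1.0, 2.0]]
```

With the exact eigenvector $(1, 1)$ and the claimed value $\mu = 2.9$, the residual is $0.1$, so the pair passes with threshold $0.2$ and fails with threshold $0.05$, matching $|\mu - \lambda| = 0.1$.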

In general, if we do not make assumptions on the problem condition, the
approximate checker may yield very poor bounds. This is because the determinant
of a matrix can be very close to zero (e.g., $1/2^n$) despite all eigenvalues being
well-separated from zero (e.g., $\lambda_i = 1/2$).

Singular Values. Suppose Mult is a correct library program for matrix
multiplication. If $\Pi_{sing}$ is a program that purports to compute the singular
values $\sigma_1, \ldots, \sigma_n$ of $A$, construct a checker for $\Pi_{sing}$ as
follows: (i) check if $\sigma_i \ge 0$, $1 \le i \le n$, (ii) compute the matrix
$A^T A \in \mathcal{F}^{n \times n}$ using Mult, and (iii) use the checkers for
eigenvalues to verify if $\{\sigma_1^2, \ldots, \sigma_n^2\} = \lambda(A^T A)$. The
correctness of this construction is from the definition of singular values (see
[GV89]).


7  Further  Work 

All  of  our  checkers  are  assumed  to  perform  exact  arithmetic.  This  assumption 
is  not  always  true  in  practice.  It  will  be  interesting  to  design  checkers  when  the 
checker’s  numerical  errors  are  critical.  Many  issues  are  still  unresolved  in  the 
case of $\tilde p$. Are there efficient checkers for programs that compute Gröbner
bases, programs that solve Diophantine problems and lattice problems? Such
checkers would find numerous applications in computational algebra systems. Can
we get efficient checkers for sparse-matrix computations?

Abstract. In 1876, Lewis Carroll proposed a voting system in which the winner
is the candidate who with the fewest changes in voters’ preferences becomes
a Condorcet winner, that is, a candidate who beats all other candidates in
pairwise majority-rule elections. Bartholdi, Tovey, and Trick provided a lower
bound (NP-hardness) on the computational complexity of determining the election
winner in Carroll’s system. We provide a stronger lower bound and an upper bound
that matches our lower bound. In particular, determining the winner in Carroll’s
system is complete for parallel access to NP, i.e., it is complete for
$\Theta_2^p$, for which it becomes the most natural complete problem known. It
follows that determining the winner in Carroll’s elections is not NP-complete
unless the polynomial hierarchy collapses.


1  Introduction 

The Condorcet criterion is that an election is won by any candidate who defeats
all others in pairwise majority-rule elections ([Con85], see [Bla58]). The Condorcet
Paradox, dating from 1785 [Con85], notes that not only is it not always the case that
Condorcet winners exist but, far worse, when there are more than two candidates,
pairwise majority-rule elections may yield strict cycles in the aggregate preference
even if each voter has non-cyclic preferences.¹ This is a widely discussed and
troubling feature of majority rule (see, e.g., the discussion in [Mue89]).

* A full version of this paper, including all proofs, can be found at
http://www.cs.rochester.edu/trs as UR-CS-TR-96-640. Supported in part by grants
NSF-CCR-9322513 and NSF-INT-9513368/DAAD-315-PRO-fo-ab, and a University of
Rochester Bridging Fellowship.

** edith@bamboo.lemoyne.edu. Work done in part while visiting
Friedrich-Schiller-Universität Jena and the University of Amsterdam.

*** lane@cs.rochester.edu. Work done in part while visiting
Friedrich-Schiller-Universität Jena and the University of Amsterdam.

† rothe@informatik.uni-jena.de. Work done in part while visiting Le Moyne College.

¹ The standard example is an election over candidates $a$, $b$, and $c$ in which 1/3
of the voters have preference $\langle a < b < c \rangle$, 1/3 of the voters have
preference $\langle b < c < a \rangle$, and 1/3 of the voters have preference
$\langle c < a < b \rangle$. In this case, though each voter individually has
well-ordered preferences, the aggregate preference of the electorate is that $b$
trounces $a$, $c$ trounces $b$, and $a$ trounces $c$. In short, individually
well-ordered preferences do not necessarily aggregate to a well-ordered societal
preference.

In 1876, Charles Lutwidge Dodgson (more commonly referred to today by his pen
name, Lewis Carroll) proposed an election system that is inspired by the Condorcet
criterion, yet that sidesteps the abovementioned problem [Dod76]. In particular, a
Condorcet winner is a candidate who defeats each other candidate in pairwise
majority-rule elections. In Carroll’s system, an election is won by the candidate
who is “closest” to being a Condorcet winner. In particular, each candidate is
given a score that is the smallest number of exchanges of adjacent preferences in
the voters’ preference orders needed to make the candidate a Condorcet winner with
respect to the resulting preference orders. Whatever candidate (or candidates, in
the case of a tie) has the lowest score is the winner. This system admits ties but,
as each candidate is assigned an integer score, no strict-preference cycles are
possible.
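
The scoring rule just described can be illustrated with a brute-force sketch. It performs a breadth-first search over all profiles reachable by exchanges of adjacent preferences, so it takes exponential time and is meant only for tiny elections; the representation (each voter's ranking as a tuple from most to least preferred) and the function names are assumptions of this illustration, not an algorithm from the paper.

```python
from collections import deque


def is_condorcet_winner(profile, candidate):
    """True iff candidate beats every other candidate by a strict majority."""
    for other in profile[0]:
        if other == candidate:
            continue
        wins = sum(1 for r in profile if r.index(candidate) < r.index(other))
        if 2 * wins <= len(profile):
            return False
    return True


def carroll_score(profile, candidate):
    """Fewest adjacent exchanges that make candidate a Condorcet winner."""
    start = tuple(tuple(r) for r in profile)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        if is_condorcet_winner(state, candidate):
            return dist
        for v, ranking in enumerate(state):
            for i in range(len(ranking) - 1):
                swapped = list(ranking)
                swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
                nxt = state[:v] + (tuple(swapped),) + state[v + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    return None  # not reached: moving candidate to the top everywhere suffices


# The cyclic three-voter example, written most-preferred first.
cycle = [("c", "b", "a"), ("a", "c", "b"), ("b", "a", "c")]
scores = {c: carroll_score(cycle, c) for c in "abc"}
```

On the cyclic profile no candidate is a Condorcet winner, every candidate has score 1, and so all three tie, which shows how integer scores replace the preference cycle with a tie.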

Bartholdi, Tovey, and Trick, in their paper “Voting Schemes for which It Can Be
Difficult to Tell Who Won the Election” [BTT89], raise a difficulty regarding
Carroll’s election system. Though the notion of winner(s) in Carroll’s election
system is mathematically well-defined, Bartholdi et al. raise the issue of what the
computational complexity is of determining who is the winner. Though most natural
election schemes admit obvious polynomial-time algorithms for determining who won,
in sharp contrast Bartholdi et al. prove that Carroll’s election scheme has the
disturbing property that it is NP-hard to determine whether a given candidate has
won a given election (a problem they dub CarrollWinner; they use the name
“Dodgson” throughout, but we treat this as if they had written the equivalent
“Carroll”), and that it is NP-hard even to determine whether a given candidate has
tied-or-defeated another given candidate (a problem they dub CarrollRanking).

Bartholdi, Tovey, and Trick’s NP-hardness results establish lower bounds for the
complexity of CarrollRanking and CarrollWinner. We optimally improve their
two complexity lower bounds by proving that both problems are hard for the class
of problems that can be solved via parallel access to NP, and we provide matching
upper bounds. Thus, we establish that both problems are $\Theta_2^p$-complete.
Bartholdi et al. explicitly leave open the issue of whether CarrollRanking is
NP-complete: “...Thus CarrollRanking is as hard as an NP-complete problem, but
since we do not know whether CarrollRanking is in NP, we can say only that it is
NP-hard” [BTT89, p. 161]. From our optimal lower bounds, it follows that neither
CarrollWinner nor CarrollRanking is NP-complete unless the polynomial hierarchy
collapses.

As  to  our  proof  method,  in  order  to  raise  the  known  lower  bound  on  the  complexity  of 
Carroll  elections,  we  first  study  the  ways  in  which  feasible  algorithms  can  control  Carroll 
elections.  In  particular,  we  establish  a  series  of  lemmas  showing  how  polynomial-time 
algorithms  can  control  oddness  and  evenness  of  election  scores,  “sum”  over  election 
scores,  and  merge  elections.  These  lemmas  then  lead  to  our  hardness  results. 

We remark that it is somewhat curious finding “parallel access to NP”-complete
(i.e., $\Theta_2^p$-complete) problems that were introduced almost one hundred years
before complexity theory itself existed. In addition, CarrollWinner, which we prove
complete for this class, is extremely natural when compared with previously known
complete problems for this class, essentially all of which have quite convoluted
forms, e.g., asking whether a given list of boolean formulas has the property that
the number of formulas in the list that are satisfiable is itself an odd number
(see the discussion in [Wag87]). In contrast, the class NP, which is contained in
$\Theta_2^p$, has countless natural complete problems. Also, we mention that
Papadimitriou [Pap84] has shown that UniqueOptimalTravelingSalesperson is complete
for $P^{NP}$, which contains $\Theta_2^p$.

² Carroll did not use this term. Indeed, Black has shown that Carroll “almost beyond
a doubt” was unfamiliar with Condorcet’s work [Bla58, p. 193-194].

2  Preliminaries 

In  this  section,  we  introduce  some  standard  concepts  and  notations  from  com¬ 
putational  complexity  theory  [Pap94,BC93].  NP  is  the  class  of  languages  solvable 
in  nondeterministic  polynomial  time.  The  polynomial  hierarchy,  PH,  is  defined  as 
PH  =  P  U  NP  U  NP'^^  U  NP^^  U  •  •  •  where,  for  any  class  C,  NP^  =  Ucec  NP^, 
and  NP^*  is  the  class  of  all  languages  that  can  be  accepted  by  some  NP  machine  that  is 
given  a  black  box  that  in  unit  time  answers  membership  queries  to  C.  The  polynomial 
hierarchy  is  said  to  collapse  if  for  some  k  the  ^th  term  in  the  preceding  infinite  union 
equals  the  entire  infinite  union.  Computer  scientists  strongly  suspect  that  the  polynomial 
hierarchy  does  not  collapse,  though  proving  (or  disproving)  this  remains  a  major  open 
research  issue. 

The polynomial hierarchy has a number of intermediate levels. Of particular interest
to us will be the level $\Theta_2^p$, the class of all languages that can be solved via
$O(\log n)$ queries to some NP set (see [Wag90]). Equivalently, and more to the point for
the purposes of this paper, $\Theta_2^p$ equals the class of problems that can be solved via parallel
access to NP, as explained formally below. $\Theta_2^p$ falls between the first and second levels
of the polynomial hierarchy: $\mathrm{NP} \subseteq \Theta_2^p \subseteq \mathrm{P}^{\mathrm{NP}} \subseteq \mathrm{NP}^{\mathrm{NP}}$. Kadin [Kad89] has proven that
if NP has a sparse Turing-complete set then the polynomial hierarchy collapses to $\Theta_2^p$,
Wagner [Wag90] has shown that the definition of $\Theta_2^p$ is extremely robust, and Jenner
and Torán [JT95] have shown that the robustness of the class $\Theta_2^p$ seems to fail for its
function analogs.

Problems are encoded as languages of strings over some fixed alphabet $\Sigma$ having
at least two letters. $\Sigma^*$ denotes the set of all strings over $\Sigma$. For any string $x \in \Sigma^*$,
let $|x|$ denote the length of $x$. For any set $A \subseteq \Sigma^*$, let $\overline{A}$ denote $\Sigma^* \setminus A$. For any set
$A \subseteq \Sigma^*$, let $\|A\|$ denote the cardinality of $A$. For any multiset $A$, $\|A\|$ will denote
the cardinality of $A$. For example, if $A$ is the multiset containing one occurrence of
the preference order $\langle w < x < y \rangle$ and seventeen occurrences of the preference order
$\langle w < y < x \rangle$, then $\|A\| = 18$. As is standard, for each language $A \subseteq \Sigma^*$ we use $\chi_A$
to denote the characteristic function of $A$, i.e., $\chi_A(x) = 1$ if $x \in A$ and $\chi_A(x) = 0$
if $x \notin A$. Let $\langle \cdots \rangle$ be any standard, multi-arity, easily computable, easily invertible
pairing function. We will also use the notation $\langle \cdots \rangle$ to denote preference orders, e.g.,
$\langle w < x < y \rangle$. Which use is intended will be clear from context.

In computational complexity theory, reductions are used to relate the complexity of
problems. Very informally, if $A$ reduces to $B$ that means that, given $B$, one can solve
$A$. For any $a$ and $b$ such that $\leq_b^a$ is a defined reduction type, and any complexity class
$\mathcal{C}$, let $R_b^a(\mathcal{C})$ denote $\{L \mid (\exists C \in \mathcal{C})[L \leq_b^a C]\}$. We refer readers to the standard source,
Ladner, Lynch, and Selman [LLS75], for definitions and discussion of the standard
reductions. However, we briefly and informally present to the reader the definitions
of the reductions to be used in this paper. $A \leq_m^p B$ (“$A$ polynomial-time many-one
reduces to $B$”) if there is a polynomial-time computable function $f$ such that
$(\forall x \in \Sigma^*)[x \in A \iff f(x) \in B]$. $A \leq_{tt}^p B$ (“$A$ polynomial-time truth-table
reduces to $B$”) if there is a polynomial-time Turing machine that, on input $x$, computes
a query that itself consists of a list of strings and, given that the machine after writing
the query is then given as its answer a list telling which of the listed strings are in
$B$, the machine then correctly determines whether $x$ is in $A$ (this is not the original
Ladner-Lynch-Selman definition, as we have merged their querying machine and their
evaluation machine; however, this formulation is common and equivalent). Since a
$\leq_{tt}^p$-reducing machine, on a given input, asks all its questions in a parallel (also called
non-adaptive) manner, the informal statement above that $\Theta_2^p$ captures the complexity
of “parallel access to NP” can now be expressed formally as the claim $\Theta_2^p = R_{tt}^p(\mathrm{NP})$,
which is known to hold [KSW87,Hem89].

As has become the norm, we always use hardness to denote hardness with respect
to $\leq_m^p$ reductions. That is, for any class $\mathcal{C}$ and any problem $A$, we say that $A$ is $\mathcal{C}$-hard if
$(\forall C \in \mathcal{C})[C \leq_m^p A]$. For any class $\mathcal{C}$ and any problem $A$, we say that $A$ is $\mathcal{C}$-complete if
$A$ is $\mathcal{C}$-hard and $A \in \mathcal{C}$. Completeness results are the standard method in computational
complexity theory of categorizing the complexity of a problem, as a $\mathcal{C}$-complete problem
$A$ is both in $\mathcal{C}$, and is the hardest problem in $\mathcal{C}$ (in the sense that every problem in $\mathcal{C}$ can
be easily solved using $A$).

3  The  Complexity  of  Carroll  Elections 

Lewis  Carroll’s  voting  system  ([Dod76],  see  also  [NR76,BTT89])  works  as  follows. 
Each  voter  has  strict  preferences  over  the  candidates.  Each  candidate  is  assigned  a  score, 
namely,  the  smallest  number  of  sequential  exchanges  of  two  adjacent  candidates  in  the 
voters’ preference orders (henceforward called “switches”) needed to make the given
candidate  a  Condorcet  winner.  We  say  that  a  candidate  c  ties-or-defeats  a  candidate  d  if 
the  score  of  d  is  not  less  than  that  of  c.  (Bartholdi  et  al.  [BTT89]  use  the  term  “defeats”  to 
denote  what  we,  for  clarity,  denote  by  ties-or-defeats;  though  the  notations  are  different, 
the  sets  being  defined  by  Bartholdi  et  al.  and  in  this  paper  are  identical.)  A  candidate 
c  is  said  to  win  the  Carroll-type  election  if  c  ties-or-defeats  all  other  candidates.  Of 
course,  due  to  ties  it  is  possible  for  two  candidates  to  tie-or-defeat  each  other,  and  so  it 
is  possible  for  more  than  one  candidate  to  be  a  winner  of  the  election. 
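
The ties-or-defeats relation, and the fact that several candidates can win at once, can be made concrete with a short Python sketch. The score tables below are hypothetical and written by hand purely for illustration, and the function names are not from the paper.

```python
def ties_or_defeats(scores, c, d):
    """c ties-or-defeats d when the score of d is not less than that of c."""
    return scores[c] <= scores[d]


def carroll_winners(scores):
    """All candidates that tie-or-defeat every other candidate."""
    return [c for c in scores
            if all(ties_or_defeats(scores, c, d) for d in scores if d != c)]


# Hypothetical score tables, for illustration only.
single = {"a": 1, "b": 0, "c": 2}
tied = {"a": 1, "b": 1, "c": 3}

print(carroll_winners(single))  # ['b']
print(carroll_winners(tied))    # ['a', 'b']
```

In the second table, a and b tie-or-defeat each other, so both are winners.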

Recall  that  all  preferences  are  assumed  to  be  strict.  A  candidate  c  is  a  Condorcet 
winner  (with  respect  to  a  given  collection  of  voter  preferences)  if  c  defeats  (i.e.,  is 
preferred  by  strictly  more  than  half  of  the  voters)  each  other  candidate  in  pairwise 
majority-rule  elections.  Of  course,  Condorcet  winners  do  not  necessarily  exist  for  a 
given  set  of  preferences,  but  if  a  Condorcet  winner  does  exist,  it  is  unique. 
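
As a minimal sketch of the Condorcet-winner definition, the following Python code checks pairwise strict majorities. The encoding is an assumption made here, not the paper's: each voter is a list of candidates ordered from least to most preferred, mirroring the $\langle a < b < c \rangle$ notation.

```python
def is_condorcet_winner(c, voters):
    """True if c is preferred to each rival by strictly more than half the voters."""
    for d in voters[0]:
        if d == c:
            continue
        support = sum(1 for order in voters if order.index(c) > order.index(d))
        if 2 * support <= len(voters):
            return False
    return True


def condorcet_winner(voters):
    """Return the unique Condorcet winner, or None if there is none."""
    winners = [c for c in voters[0] if is_condorcet_winner(c, voters)]
    return winners[0] if winners else None


# b beats a by 2:1 and beats c by 2:1.
profile = [["a", "c", "b"], ["c", "a", "b"], ["b", "c", "a"]]
# A cyclic profile: b beats a, c beats b, a beats c, so nobody wins.
cycle = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]

print(condorcet_winner(profile))  # b
print(condorcet_winner(cycle))    # None
```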

We  now  return  to  Carroll’s  scoring  notion  to  clarify  what  is  meant  by  the  sequential 
nature  of  the  switches,  and  to  clarify  by  example  that  one  switch  changes  only  one  voter’s 
preferences.  The  (Carroll)  score  of  any  Condorcet  winner  is  0.  If  a  candidate  is  not  a 
Condorcet  winner,  but  one  switch  (recall  that  a  switch  is  an  exchange  of  two  adjacent 
preferences  in  the  preference  order  of  one  voter)  would  make  the  candidate  a  Condorcet 
winner, then the candidate has a score of 1. If a candidate does not have a score of 0 or 1,
but two switches would make the candidate a Condorcet winner, then the candidate has
a score of 2. Note that the two switches could both be in the same voter’s preferences, or
could be one in one voter’s preferences and one in another voter’s preferences. Note also
that switches are sequential. For example, with two switches, one could change a single
voter’s preferences from $\langle a < b < c < d \rangle$ to $\langle c < a < b < d \rangle$, where $e < f$ will denote
the preference: “$f$ is strictly preferred to $e$.” With two switches, one could also change a
single voter’s preferences from $\langle a < b < c < d \rangle$ to $\langle b < a < d < c \rangle$. With two switches
(not one), one could also change two voters with initial preferences of $\langle a < b < c < d \rangle$
and $\langle a < b < c < d \rangle$ to the new preferences $\langle b < a < c < d \rangle$ and $\langle b < a < c < d \rangle$. As
noted earlier in this section, Carroll scores of 3, 4, etc., are defined analogously, i.e., the
Carroll  score  of  a  candidate  is  the  smallest  number  of  sequential  switches  needed  to  make 
the  given  candidate  a  Condorcet  winner.  (We  note  in  passing  that  Carroll  was  before  his 
time  in  more  ways  than  one.  His  definition  is  closely  related  to  an  important  concept 
that  is  now  known  in  computer  science  as  “edit-distance” — the  minimum  number  of 
operations  (from  some  specified  set  of  operations)  required  to  transform  one  string  into 
another.  Though  Carroll’s  single  “switch”  operation  is  not  the  richer  set  of  operations 
most  commonly  used  today  when  doing  string-to-string  editing  (see,  e.g.,  [SK83]),  it 
does  form  a  valid  basis  operation  for  transforming  between  permutations,  which  after 
all  are  what  preferences  are.) 
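
The score definition can be read directly as a shortest-path search over voter profiles, where each edge is one switch. The sketch below is a brute-force illustration under that reading, not an algorithm from the paper; it is exponential in general and meant only for tiny examples. Voters are encoded (an assumption made here) as tuples ordered from least to most preferred.

```python
from collections import deque


def is_condorcet_winner(c, voters):
    """c must be preferred to each rival by strictly more than half the voters."""
    for d in voters[0]:
        if d == c:
            continue
        support = sum(1 for v in voters if v.index(c) > v.index(d))
        if 2 * support <= len(voters):
            return False
    return True


def neighbours(voters):
    """All profiles reachable by one switch: one adjacent exchange in one voter."""
    for i, v in enumerate(voters):
        for j in range(len(v) - 1):
            swapped = v[:j] + (v[j + 1], v[j]) + v[j + 2:]
            yield voters[:i] + (swapped,) + voters[i + 1:]


def carroll_score(c, voters):
    """Smallest number of sequential switches that makes c a Condorcet winner."""
    start = tuple(tuple(v) for v in voters)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        profile, dist = queue.popleft()
        if is_condorcet_winner(c, profile):
            return dist
        for nxt in neighbours(profile):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))


profile = [("a", "c", "b"), ("c", "a", "b"), ("b", "c", "a")]
print({x: carroll_score(x, profile) for x in "abc"})  # {'a': 1, 'b': 0, 'c': 2}
```

Here b is already a Condorcet winner, a needs a single switch past b in the second voter, and c needs one switch in each of two different voters.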

Bartholdi  et  al.  [BTT89]  define  a  number  of  decision  problems  related  to  Carroll’s 
system.  They  prove  that  given  preference  lists,  and  a  candidate,  and  a  number  k,  it  is 
NP-complete  to  determine  whether  the  candidate’s  score  is  at  most  k  in  the  election 
specified by the preference lists (they call this problem CarrollScore). They define
the problem CarrollRanking to be the problem of determining, given preference
lists and the names of two candidates, c and d, whether c ties-or-defeats d. They prove that
this  problem  is  NP-hard.  They  also  prove  that,  given  a  candidate  and  preference  lists,  it 
is  NP-hard  to  determine  whether  the  candidate  is  a  winner  of  the  election. 

For  the  formal  definitions  of  these  three  decision  problems,  a  preference  order  is 
strict  (i.e.,  irreflexive  and  antisymmetric),  transitive,  and  complete.  Since  we  will  freely 
identify  voters  with  their  preference  orders,  and  two  different  voters  can  have  the  same 
preference  order,  we  define  a  set  of  voters  as  a  multiset  of  preference  orders. 

We will say that $\langle C, c, V \rangle$ is a Carroll triple if $C$ is a set of candidates, $c$ is a
member of $C$, and $V$ is a multiset of preference orders on $C$. Throughout this paper,
we assume that, as inputs, multisets are coded as lists, i.e., if there are $m$ voters in the
voter set then $V = \langle P_1, P_2, \ldots, P_m \rangle$, where $P_i$ is the preference order of the $i$th voter.
$\mathit{Score}(\langle C, c, V \rangle)$ will denote the Carroll score of $c$ in the vote specified by $C$ and $V$.

Decision Problem: CarrollScore

Instance: A Carroll triple $\langle C, c, V \rangle$; a positive integer $k$.

Question: Is $\mathit{Score}(\langle C, c, V \rangle)$, the Carroll score of candidate $c$ in the election specified
by $\langle C, V \rangle$, less than or equal to $k$?


Decision Problem: CarrollRanking

Instance: A set of candidates $C$; two distinguished members of $C$, $c$ and $d$; a multiset
$V$ of preference orders on $C$ (encoded as a list, as discussed above).

Question: Does $c$ tie-or-defeat $d$ in the election? That is, is $\mathit{Score}(\langle C, c, V \rangle) \leq
\mathit{Score}(\langle C, d, V \rangle)$?


Decision Problem: CarrollWinner

Instance: A Carroll triple $\langle C, c, V \rangle$.

Question: Is $c$ a winner of the election? That is, does $c$ tie-or-defeat all other candidates
in the election?

We now state the complexity of CarrollRanking.

Theorem 1. CarrollRanking is $\Theta_2^p$-complete.

It follows immediately, since (a) $\Theta_2^p = \mathrm{NP} \Rightarrow \mathrm{PH} = \mathrm{NP}$, and (b) $R_m^p(\mathrm{NP}) = \mathrm{NP}$,
that CarrollRanking, though known to be NP-hard [BTT89], cannot be NP-complete
unless the polynomial hierarchy collapses quite dramatically.

Corollary 2. If CarrollRanking is NP-complete, then $\mathrm{PH} = \mathrm{NP}$.

Wagner has provided a useful tool for proving $\Theta_2^p$-hardness, and we state his result
below as Lemma 3. However, to be able to exploit this tool we must explore the structure
of Carroll elections. In particular, we have to learn how to control oddness and evenness
of election scores, how to add election scores, and how to merge elections. We do so as
Lemmas 4, 5, and 7, respectively. On our way towards establishing Theorem 1, using
Lemmas 3, 4, and 5 we will first establish $\Theta_2^p$-hardness of a special problem that is
closely related to CarrollRanking. This result is stated as Lemma 6 below. It is
not hard to prove Theorem 1 using Lemma 6 and Lemma 7. Note that Lemma 7 gives
more than is needed merely to establish Theorem 1. In fact, the way this lemma is stated
even suffices to provide, jointly with Lemma 6, a direct proof of the $\Theta_2^p$-hardness of
CarrollWinner.

Lemma 3. [Wag87] Let $A$ be some NP-complete set, and let $B$ be any set. If there
exists a polynomial-time computable function $g$ such that, for all $k \geq 1$ and all strings
$x_1, \ldots, x_{2k} \in \Sigma^*$ satisfying $\chi_A(x_1) \geq \chi_A(x_2) \geq \cdots \geq \chi_A(x_{2k})$, it holds that

$$\|\{i \mid x_i \in A\}\| \text{ is odd} \iff g(x_1, \ldots, x_{2k}) \in B,$$

then $B$ is $\Theta_2^p$-hard.

Lemma 4. There exists an NP-complete set $A$ and a polynomial-time computable function
$f$ that reduces $A$ to CarrollScore in such a way that, for every $x \in \Sigma^*$,
$f(x) = \langle \langle C, c, V \rangle, k \rangle$ is an instance of CarrollScore with an odd number
of voters and (1) if $x \in A$ then $\mathit{Score}(\langle C, c, V \rangle) = k$, and (2) if $x \notin A$ then
$\mathit{Score}(\langle C, c, V \rangle) = k + 1$.

Proof  of  Lemma  4.  Bartholdi  et  al.  [BTT89]  prove  the  NP-hardness  of 
CarrollScore  by  reducing  ExactCoverByThreeSets  to  it.  However, 
their  reduction  doesn’t  have  the  additional  properties  that  we  need  in  this 
lemma.  We  will  construct  a  reduction  from  the  standard  NP-complete  problem 
ThreeDimensionalMatching (3DM) to CarrollScore that does have the additional
properties we need. Let us first give the definition of 3DM:

Decision  Problem:  ThreeDimensionalMatching  (3DM) 

Instance: Sets $M$, $W$, $X$, and $Y$, where $M \subseteq W \times X \times Y$ and $W$, $X$, and $Y$ are
disjoint, nonempty sets having the same number of elements.

Question: Does $M$ contain a matching, i.e., a subset $M' \subseteq M$ such that $\|M'\| = \|W\|$
and no two elements of $M'$ agree in any coordinate?

We now describe a polynomial-time reduction $f$ (from 3DM to CarrollScore)
having the desired properties. Our reduction is defined by $f(x) = f'(f''(x))$, where $f'$
and $f''$ are as described below. Informally, $f''$ turns all inputs into a standard format
(instances of 3DM having $\|M\| > 1$), and $f'$ assumes its input has this format and
implements the actual reduction.

Let $f''$ be a polynomial-time function that has the following properties.

1. If $x$ is not an instance of 3DM or is an instance of 3DM having $\|M\| \leq 1$, then $f''(x)$
will output an instance $y$ of 3DM for which $\|M\| > 1$ and, furthermore, it will hold
that $y \in \mathrm{3DM} \iff x \in \mathrm{3DM}$.

2. If $x$ is an instance of 3DM having $\|M\| > 1$, then $f''(x) = x$.

It is clear that such functions exist. In particular, for concreteness, let $f''(x)$
be some fixed instance of 3DM with $\|M\| > 1$ that has no matching if $x$ is not an instance of 3DM or
both $x \notin \mathrm{3DM}$ and $x$ is an instance of 3DM having $\|M\| \leq 1$; let $f''(x)$ be
$\langle \{(d,e,p),(d',e',p')\},\{d,d'\},\{e,e'\},\{p,p'\} \rangle$ if $x$ is an instance of 3DM having
$\|M\| \leq 1$ and such that $x \in \mathrm{3DM}$; let $f''(x)$ be $x$ otherwise.

We now describe $f'$. Let $x$ be our input. If $x$ is not an instance of 3DM for which
$\|M\| > 1$ then $f'(x) = 0$; this is just for definiteness, as due to $f''$, the only actions
of $f'$ that matter are when the input is an instance of 3DM for which $\|M\| > 1$. So,
suppose $x = \langle M, W, X, Y \rangle$ is an instance of 3DM for which $\|M\| > 1$. Let $q = \|W\|$.
Define $f'(\langle M, W, X, Y \rangle) = \langle \langle C, c, V \rangle, 3q \rangle$ as follows: Let $c$, $s$, and $t$ be elements not
in $W \cup X \cup Y$. Let $C = W \cup X \cup Y \cup \{c, s, t\}$ and let $V$ consist of the following two
subparts:

1. Voters simulating elements of $M$. Suppose the elements of $M$ are enumerated as
$\{(w_i, x_i, y_i) \mid 1 \leq i \leq \|M\|\}$. (The $w_i$ are not intended to be an enumeration of $W$.
Rather, they take on values from $W$ as specified by $M$. In particular, $w_j$ may equal
$w_k$ even if $j \neq k$. The analogous comments apply to the $x_i$ and $y_i$ variables.) For
every triple $(w_i, x_i, y_i)$ in $M$, we will create a voter. If $i$ is odd, we create the voter
$\langle s < c < w_i < x_i < y_i < t < \cdots \rangle$, where the elements after $t$ are the elements of
$C \setminus \{s, c, w_i, x_i, y_i, t\}$ in arbitrary order. If $i$ is even, we do the same, except that we
exchange $s$ and $t$. That is, we create the voter $\langle t < c < w_i < x_i < y_i < s < \cdots \rangle$,
where the elements after $s$ are the elements of $C \setminus \{s, c, w_i, x_i, y_i, t\}$ in arbitrary
order.

2. $\|M\| - 1$ voters who prefer $c$ to all other candidates.
We will now show that $f$ has the desired properties. It is immediately clear that $f''$
and $f'$, and thus $f$, are polynomial-time computable. It is also clear from our construction
that, for each $x$, $f(x)$ is an instance of CarrollScore having an odd number of voters
since, for every instance $\langle M, W, X, Y \rangle$ of 3DM with $\|M\| > 1$, $f'(\langle M, W, X, Y \rangle)$ is
an instance of CarrollScore with $\|M\| + (\|M\| - 1)$ voters, and since $f''$ always
outputs instances of this form. It remains to show that, for every instance $\langle M, W, X, Y \rangle$
of 3DM with $\|M\| > 1$:

(a) if $M$ contains a matching, then $\mathit{Score}(\langle C, c, V \rangle) = 3q$, and

(b) if $M$ does not contain a matching, then $\mathit{Score}(\langle C, c, V \rangle) = 3q + 1$.

Note that if we prove this, it is clear that $f$ has the properties (1) and (2) of Lemma 4, in
light of the properties of $f''$. Note that, recalling that we may now assume that $\|M\| > 1$,
by construction $c$ is preferred to $s$ and $t$ by more than half of the voters, and is preferred
to all other candidates by $\|M\| - 1$ of the $2\|M\| - 1$ voters.
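
The voter construction and the vote counts just noted can be checked mechanically. The following Python sketch builds the voters of $f'$ for a small, made-up 3DM instance. It assumes that the strings "c", "s", and "t" do not occur in $W$, $X$, or $Y$, fixes the "arbitrary order" to sorted order, and lists each voter from least to most preferred. All function and variable names are illustrative.

```python
def build_carroll_instance(M, W, X, Y):
    """Sketch of f' for a 3DM instance with len(M) > 1; returns (C, c, V, k)."""
    q = len(W)
    c, s, t = "c", "s", "t"          # assumed fresh candidate names
    C = list(W) + list(X) + list(Y) + [c, s, t]
    V = []
    for i, (w, x, y) in enumerate(M, start=1):
        # Odd i: s is at the bottom and t follows y; even i: s and t exchanged.
        low, high = (s, t) if i % 2 == 1 else (t, s)
        head = [low, c, w, x, y, high]
        V.append(head + sorted(set(C) - set(head)))
    for _ in range(len(M) - 1):
        V.append(sorted(set(C) - {c}) + [c])   # dummy voter: c on top
    return C, c, V, 3 * q


def support(V, a, b):
    """Number of voters who strictly prefer a to b."""
    return sum(1 for v in V if v.index(a) > v.index(b))


W, X, Y = ["w1", "w2"], ["x1", "x2"], ["y1", "y2"]
M = [("w1", "x1", "y1"), ("w2", "x2", "y2"), ("w1", "x2", "y1")]
C, c, V, k = build_carroll_instance(M, W, X, Y)

print(len(V), k)                                   # 5 6
print(support(V, c, "s"), support(V, c, "t"))      # 4 3
print({z: support(V, c, z) for z in W + X + Y})    # every value is 2
```

With five voters, c already has a strict majority over s and t, but falls one vote short against every element of $W \cup X \cup Y$.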

Now suppose that $M$ contains a matching $M'$. Then $\|M'\| = q$, and every element
in $W \cup X \cup Y$ occurs in $M'$. $3q$ switches turn $c$ into a Condorcet winner
as follows. For every element $(w_i, x_i, y_i) \in M'$, switch $c$ upwards 3 times in the
voter corresponding to $(w_i, x_i, y_i)$. For example, if $i$ is odd, this voter changes from
$\langle s < c < w_i < x_i < y_i < t < \cdots \rangle$ to $\langle s < w_i < x_i < y_i < c < t < \cdots \rangle$. Let $z$ be an
arbitrary element of $W \cup X \cup Y$. Since $z$ occurs in $M'$, $c$ has gained one vote over $z$.
Thus, $c$ is preferred to $z$ by $\|M\|$ of the $2\|M\| - 1$ voters. Since $z$ was arbitrary, $c$ is a
Condorcet winner.

On the other hand, $c$’s Carroll score can never be less than $3q$, because to turn $c$ into
a Condorcet winner, $c$ needs to gain one vote over $z$ for every $z \in W \cup X \cup Y$. Since $c$
can gain only one vote over one candidate for each switch, we need at least $3q$ switches
to turn $c$ into a Condorcet winner. This proves condition (a).

To prove condition (b), first note that there is a “trivial” way to turn $c$ into a Condorcet
winner with $3q + 1$ switches: Just switch $c$ to the top of the preference order of the first
voter. The first voter was of the form $\langle s < c < w_1 < x_1 < y_1 < t < \cdots \rangle$, where the
elements after $t$ are exactly all elements in $(W \cup X \cup Y) \setminus \{w_1, x_1, y_1\}$, in arbitrary order.
Switching $c$ upwards $3q + 1$ times moves $c$ to the top of the preference order for this
voter, and gains one vote for $c$ over all candidates in $W \cup X \cup Y$, which turns $c$ into a
Condorcet winner. This shows that $\mathit{Score}(\langle C, c, V \rangle) \leq 3q + 1$, regardless of whether $M$
has a matching or not.

Finally, note that a Carroll score of $3q$ implies that $M$ has a matching. As before,
every switch has to involve $c$ and an element of $W \cup X \cup Y$. (This is because $c$ must gain
a vote over $3q$ other candidates, namely $W \cup X \cup Y$, and so any switch involving $s$ or $t$ would
ensure that at most $3q - 1$ switches were available for gaining against the $3q$ members
of $W \cup X \cup Y$, thus ensuring failure.) Thus, for every voter, $c$ switches at most three
times to become a Condorcet winner. Since $c$ has to gain one vote in particular over
each element in $Y$, and to “reach” an element in $Y$ it must hold that $c$ first switches over
the elements of $W$ and $X$ that due to our construction fall between it and the nearest
$y$ element (among the $\|M\|$ voters simulating elements of $M$; it is clear that if any
switch involves at least one of the $\|M\| - 1$ dummy voters this could never lead to a
Carroll score of $3q$ for $c$), it must be the case that $c$ switches upwards exactly three times
for exactly $q$ voters corresponding to elements of $M$. This implies that the $q$ elements of
$M$ that correspond to these $q$ voters form a matching, thus proving condition (b). ∎
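
Conditions (a) and (b) can be spot-checked on two tiny instances with $q = 2$. The Python sketch below is an illustration only, not part of the proof. It searches only over switches that move $c$ upward. This simplification is an assumption of the sketch, and it relies on the observation that only switches involving $c$ change $c$'s pairwise tallies. The names "c", "s", and "t" are assumed to be fresh, and the "arbitrary order" is fixed to sorted order.

```python
from itertools import product


def build_voters(M, W, X, Y):
    """Voters of f'(<M, W, X, Y>), each listed from least to most preferred."""
    C = list(W) + list(X) + list(Y) + ["c", "s", "t"]
    V = []
    for i, (w, x, y) in enumerate(M, start=1):
        low, high = ("s", "t") if i % 2 == 1 else ("t", "s")
        head = [low, "c", w, x, y, high]
        V.append(head + sorted(set(C) - set(head)))
    V += [sorted(set(C) - {"c"}) + ["c"] for _ in range(len(M) - 1)]
    return V


def is_condorcet_winner(c, V):
    return all(2 * sum(1 for v in V if v.index(c) > v.index(d)) > len(V)
               for d in V[0] if d != c)


def raise_candidate(v, c, steps):
    """Move c upward by `steps` adjacent switches in preference order v."""
    i = v.index(c)
    return v[:i] + v[i + 1:i + 1 + steps] + [c] + v[i + 1 + steps:]


def carroll_score_upward(c, V):
    """Fewest upward switches of c, over all voters, making c a Condorcet winner."""
    room = [len(v) - 1 - v.index(c) for v in V]
    best = None
    for lifts in product(*(range(r + 1) for r in room)):
        total = sum(lifts)
        if best is not None and total >= best:
            continue
        moved = [raise_candidate(v, c, n) for v, n in zip(V, lifts)]
        if is_condorcet_winner(c, moved):
            best = total
    return best


W, X, Y = ["w1", "w2"], ["x1", "x2"], ["y1", "y2"]
with_matching = [("w1", "x1", "y1"), ("w2", "x2", "y2")]
without_matching = [("w1", "x1", "y1"), ("w1", "x2", "y2")]

score_yes = carroll_score_upward("c", build_voters(with_matching, W, X, Y))
score_no = carroll_score_upward("c", build_voters(without_matching, W, X, Y))
print(score_yes, score_no)  # 6 7
```

The first instance has a matching and yields $3q = 6$. The second leaves $w_2$ uncovered, so $c$ must cross $s$ or $t$ in some voter to reach it, and the score is $3q + 1 = 7$.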

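Lemma 5. There exists a polynomial-time computable function CarrollSum such that,
for all $k$ and for all $\langle C_1, c_1, V_1 \rangle, \langle C_2, c_2, V_2 \rangle, \ldots, \langle C_k, c_k, V_k \rangle$ satisfying $(\forall j)[\|V_j\|$
is odd$]$, it holds that $\mathit{CarrollSum}(\langle \langle C_1, c_1, V_1 \rangle, \langle C_2, c_2, V_2 \rangle, \ldots, \langle C_k, c_k, V_k \rangle \rangle)$ is a
Carroll triple having an odd number of voters and such that

$$\sum_{j} \mathit{Score}(\langle C_j, c_j, V_j \rangle) = \mathit{Score}(\mathit{CarrollSum}(\langle \langle C_1, c_1, V_1 \rangle, \langle C_2, c_2, V_2 \rangle, \ldots, \langle C_k, c_k, V_k \rangle \rangle)).$$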

Lemma 3, Lemma 4, and Lemma 5 together establish the $\Theta_2^p$-hardness of a special
problem that is closely related to the problems that we are interested in,
CarrollRanking and CarrollWinner. Let us define the decision problem
TwoElectionRanking (2ER).

Decision Problem: TwoElectionRanking (2ER)

Instance: A pair of Carroll triples $\langle \langle C, c, V \rangle, \langle D, d, W \rangle \rangle$ both having an odd number
of voters and such that $c \neq d$.

Question: Is $\mathit{Score}(\langle C, c, V \rangle) \leq \mathit{Score}(\langle D, d, W \rangle)$?

Lemma 6. TwoElectionRanking is $\Theta_2^p$-hard.

We note in passing that 2ER clearly is in $R_{tt}^p(\mathrm{NP})$, and so from the fact that $\Theta_2^p =
R_{tt}^p(\mathrm{NP})$, it is clear that 2ER is in $\Theta_2^p$. Thus, in light of Lemma 6, 2ER is $\Theta_2^p$-complete.
We also note in passing that, since one can trivially rename candidates, 2ER remains
$\Theta_2^p$-complete in the variant in which “and such that $c \neq d$” is removed from the problem’s
definition.
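
One way to see the membership claim is as a single round of parallel CarrollScore queries followed by a simple evaluation. The Python sketch below illustrates that idea and is not taken from the paper. The oracle interface, the `bound` parameter, and the toy oracle with its hidden scores are all assumptions made for illustration. A polynomial bound exists because moving a candidate to the top of every voter's order takes at most $\|V\| \cdot (\|C\| - 1)$ switches.

```python
def two_election_ranking(first, second, bound, np_oracle):
    """Decide Score(first) <= Score(second) with one round of parallel queries.

    `np_oracle` receives a list of CarrollScore questions (triple, k) and
    returns the list of answers to "is Score(triple) <= k?".
    `bound` is any upper bound on both scores.
    """
    # Non-adaptive step: write down every question before seeing any answer.
    queries = [(triple, k) for triple in (first, second)
               for k in range(bound + 1)]
    answers = dict(zip(queries, np_oracle(queries)))
    # Evaluation step: Score(first) <= Score(second) exactly when every
    # threshold met by the second election is also met by the first.
    return all(answers[(first, k)] or not answers[(second, k)]
               for k in range(bound + 1))


# Toy stand-in for the NP oracle: elections are just names with known scores.
hidden_scores = {"E1": 2, "E2": 3}


def toy_oracle(queries):
    return [hidden_scores[name] <= k for name, k in queries]


print(two_election_ranking("E1", "E2", 5, toy_oracle))  # True
print(two_election_ranking("E2", "E1", 5, toy_oracle))  # False
```

The list of queries is fixed before any answer is read, which is what makes the access non-adaptive.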

In order to make the results obtained so far applicable to CarrollRanking
and CarrollWinner, we need the following lemma that tells us how to merge two
elections into a single election in a controlled manner.

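Lemma 7. There exist polynomial-time computable functions Merge and Merge' such
that, for all Carroll triples $\langle C, c, V \rangle$ and $\langle D, d, W \rangle$ for which $c \neq d$ and both $V$ and $W$
represent odd numbers of voters, there exist $\widehat{C}$ and $\widehat{V}$ such that

(i) $\mathit{Merge}(\langle C, c, V \rangle, \langle D, d, W \rangle)$ is an instance of CarrollRanking and
$\mathit{Merge}'(\langle C, c, V \rangle, \langle D, d, W \rangle)$ is an instance of CarrollWinner,

(ii) $\mathit{Merge}(\langle C, c, V \rangle, \langle D, d, W \rangle) = \langle \widehat{C}, c, d, \widehat{V} \rangle$ and
$\mathit{Merge}'(\langle C, c, V \rangle, \langle D, d, W \rangle) = \langle \widehat{C}, c, \widehat{V} \rangle$,

(iii) $\mathit{Score}(\langle \widehat{C}, c, \widehat{V} \rangle) = \mathit{Score}(\langle C, c, V \rangle) + 1$,

(iv) $\mathit{Score}(\langle \widehat{C}, d, \widehat{V} \rangle) = \mathit{Score}(\langle D, d, W \rangle) + 1$, and

(v) for each $e \in \widehat{C} \setminus \{c, d\}$, $\mathit{Score}(\langle \widehat{C}, c, \widehat{V} \rangle) < \mathit{Score}(\langle \widehat{C}, e, \widehat{V} \rangle)$.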
The results we now have established suffice to prove both Theorem 1 above and
Theorem 8 below, which states that CarrollWinner is $\Theta_2^p$-complete, the main
result of this paper. Full proofs of the results in this paper can be found in the full
version [HHR96].

Theorem 8. CarrollWinner is $\Theta_2^p$-complete.

Corollary 9. If CarrollWinner is NP-complete, then $\mathrm{PH} = \mathrm{NP}$.
Game Theoretic Analysis
of  Call-by-Value  Computation 

KOHEI  HONDA  NOBUKO  YOSHIDA 


Abstract. We present a general semantic universe of call-by-value computation
based on elements of game semantics, and validate its appropriateness as a semantic
universe by the full abstraction result for call-by-value PCF, a generic typed
programming language with call-by-value evaluation. The key idea is to consider the
distinction between call-by-name and call-by-value as that of the structure of
information flow, which determines the basic form of games. In this way call-by-name
computation and call-by-value computation arise as two independent instances of
sequential functional computation with distinct algebraic structures. We elucidate
the type structures of the universe following the standard categorical framework
developed in the context of domain theory. The mutual relationship between the presented
category of games and the corresponding call-by-name universe is also clarified.


1.  Introduction 

Call-by-value is a mode of calling procedures widely used in imperative and functional
programming languages, e.g. [1, 30], in which one evaluates arguments before applying
the procedure concerned to them. The semantics of higher-order computation based on
call-by-value evaluation has been widely studied by many researchers in the context of
domain theory, cf. [35, 23, 32, 12, 40, 11], through which it has become clear that the
semantic framework for call-by-value computation has a basic difference from the one
for call-by-name computation (see [15, 42] for an introduction to the topic). The difference
between the semantics of call-by-value and that of call-by-name in this context may
roughly be captured as the difference in the classes of involved functions: in
call-by-name, we take any continuous functions between pointed cpos, while, in call-by-value,
one takes strict continuous functions. The latter is also equivalently presentable as
partial continuous functions between (possibly bottomless) cpos. This distinction leads
to a basic algebraic difference of the induced categorical universes, cf. [11, 12].

The present paper offers a semantic analysis of call-by-value computation from a
different angle, based on elements of game semantics. In game semantics, computation
is modelled as specific classes of interacting processes (called strategies), which, together
with a suitable notion of composition, form a categorical universe with appropriate type
structures. One may compare this approach to Böhm trees or to sequential algorithms [6,
22], in both of which computation is modelled not by set-theoretic functions of a certain
kind but by objects with internal structures which reflect computational behaviour of
the concerned class of computation. Game semantics has its origin in logic [7, 10]
and has been used for the semantic analysis of programming languages, especially for
characterising the notion of sequentiality [8, 34]. By concentrating on specific forms of
interaction which obey a few basic constraints, the approach makes it possible to extract
desired classes of interacting processes at a high level of abstraction, offering suitable
semantic universes for varied calculi and programming languages, cf. [2, 3, 4, 19, 20, 24].
The forms of interaction in these universes are however inherently call-by-name: it has
not been clear how call-by-value computation can be captured in the setting of game
semantics, in spite of its equally significant status as a mode of computation.
Figure 1


In the present work it will be shown that a general semantic universe of
call-by-value higher-order computation can indeed be simply constructed, employing basic
elements of the foregoing game semantics, but with a key difference in the structures
of interaction. More specifically, we find that the distinction between call-by-name and
call-by-value in game semantics arises as the one in the form of the flow of information.
Let us illustrate this point by simple examples. Figure 1 (a) depicts how a function
which doubles a given natural number is modelled in the foregoing game semantics (“O”
for Opponent, “P” for Player, “A” for Answer, and “Q” for Question). Computation
starts when Opponent asks a question on the right, requesting an answer: then Player
(the function) asks what the argument is on the left, from which the number is received,
and finally it returns to the right to answer the initial question by the double of the
received number. In Figure 1 (b), the same function is modelled in the call-by-value
game. This time the flow starts at the left component, which already carries a value:
then the function just returns the answer on the right. One may notice that this means
the interaction should start from an answer, which might be regarded as an anomaly in
the preceding convention in game semantics. However, it turns out that this parameter
of games, namely whether one initiates a game by answers or by questions, is orthogonal
to other basic elements of the game semantics, leading to a simple construction of a
categorical universe in which representative functional calculi based on call-by-value
evaluation can be faithfully interpreted. The independence of the parameter suggests
we may obtain a suitable universe to model, say, imperative call-by-value computation
by simply altering other parameters, cf. [4, 21]. We also note that the possibility to
model “data-driven computation” in contrast to “demand-driven computation” as games
is discussed in an early paper on game semantics by Abramsky and Jagadeesan [2].

The  main  technical  contribution  of  the  present  work  is  the  validation  of  the  semantic 
exactness  with  which  the  induced  universe  captures  the  call-by-value  sequential  higher- 
order  computation  through  the  full  abstraction  result  for  the  call-by-value  version  of 
PCF  [.35,  40],  a  paradigmatic  functional  calculus.  The  result  seems  the  first  one  in  this 
context^  and  is  easily  extendable  to  other  languages  as  we  shall  indicate  in  Section 
6.  We  also  clarify  the  relationship  between  the  present  universe  of  games  and  the 
corresponding  call-by-name  universe  by  showing  they  are  faithfully  embeddable  to  each 
other.  These  results  indicate,  together  with  the  preceding  results  on  call-by-name  PCF 
[3, 19], that the two basic notions of calling procedures in higher-order computation are
representable  in  the  game-based  semantic  framework  in  an  exact  way,  and  that  they 


*  Independently  and  concurrently  Riecke  and  Sandholm  [38]  obtained  a  similar  result,  see  Section  6. 
arise as two independent, though mutually related, semantic universes with equal status
(which parallels the findings in domain theory, cf. [11]). It is also notable that, as we
clarify later, the universe of call-by-value games assumes basic type structures which
have arisen through the categorical analysis of domain-theoretic universes for call-by-value,
or partial, computation, cf. [11, 12, 23, 31, 32, 36, 39], though with a strong
intensional flavour. This suggests that an abstract notion of "call-by-value computation"
may be delineated apart from the standard domain theoretic constructions, cf. [11, 12].

The  structure  of  games  we  shall  use  is  a  conservative  extension  of  the  construction 
by  Hyland  and  Ong  [19].  The  relationship  is  detailed  in  [18]. 

This  is  an  extended  abstract  of  [18].  The  reader  may  refer  to  [18]  for  proofs  and 
detailed  technical  discussions.  In  the  remainder,  Section  2  introduces  the  basic  notion  of 
games  and  strategies.  Sections  3  and  4  outline  the  algebraic  structures  of  the  category 
of  games  and  its  extensional  quotient.  Section  5  establishes  the  main  result  of  the  paper, 
the  inequational  full  abstraction  for  call-by-value  PCF.  Section  6  discusses  further  results 
and remaining topics. The Appendix briefly reviews call-by-value PCF.

2.  Games  and  Strategies 

This  section  introduces  the  basic  construction  of  games  and  strategies  which  are  to 
become  objects  and  morphisms  in  the  categorical  universe.  We  start  from  sorting  (the 
terminology is from [29]), from which call-by-value types arise as its specific subclass.

2.1.  Sorting  and  Type. 

(i) (sorting) A sorting $\mathbb{S}$ is a triple of: (1) $\mathbb{S}$, which is a collection of mutually disjoint
non-empty sets ranged over by $S, S', \ldots$ each called a sort, (2) $\lambda : \mathbb{S} \to \{\, [,\ (,\ ],\ ) \,\}$,
a labelling function and (3) $\mathrm{Obs} : \mathbb{S} \to 2^{\mathbb{S}}$, the justification relation (if $S' \in \mathrm{Obs}(S)$
we say $S$ justifies $S'$), where $S' \in \mathrm{Obs}(S)$ implies:

• If $\lambda(S) = [$ then $\lambda(S') \in \{\, (,\ ] \,\}$. Dually if $\lambda(S) = ($ then $\lambda(S') \in \{\, [,\ ) \,\}$.

• If $\lambda(S) = ]$ then $\lambda(S') = [$ always. Dually if $\lambda(S) = )$ then $\lambda(S') = ($ always.

Elements of a sort are called actions, writing e.g. $x^S$ when $x \in S$.

The set of initial sorts, denoted $\mathrm{init}(\mathbb{S})$, is given as $\{S \mid \text{for no } S' \in \mathbb{S}.\ S \in \mathrm{Obs}(S')\}$.

(ii)  (type)  A  cbv-type,  or  simply  a  type,  is  a  sorting  such  that  all  initial  sorts  are 
labelled  by  “]”  and  any  of  its  sorts  is  reachable  from  some  initial  sort,  where 
reachability is understood regarding sortings as graphs (nodes are sorts, directed
edges are given by $\mathrm{Obs}$). Types are denoted by $A, B, C, \ldots$

An action of a sort labelled by each of "(, [, ], )" is called, respectively, Player Question,
Opponent Question, Player Answer, and Opponent Answer, the first two collectively
Question, the last two Answer, the first and third P-action, and the second and fourth
O-action. Answers of initial sorts are often called signals. On labels we define a self-inverse
function $\overline{(\cdot)}$ giving the dual of a label, satisfying: $\overline{[} = ($ and $\overline{]} = )$.

2.2.  Examples,  (sorting) 

(i) 0 is the empty sorting, which is a type. 1 is a sorting whose unique ]-labelled sort
is a singleton, which is again a type. nat is made as 1, replacing the singleton with
$\omega$ (the set of natural numbers), similarly bool with {true, false}.

(ii) Given $\mathbb{S}$, write $\overline{\mathbb{S}}$ for the sorting which is the result of changing labels by $\overline{(\cdot)}$. So
$\overline{\mathsf{nat}}$ is the sorting with the same sort as nat which is however labelled by ")". Next,
given $\mathbb{S}_1$ and $\mathbb{S}_2$, let $\mathbb{S}_1 \uplus \mathbb{S}_2$ denote their disjoint union, i.e. the sorts are the
disjoint union of those of $\mathbb{S}_1$ and $\mathbb{S}_2$, inheriting labelling and justification. Then $\overline{\mathsf{nat}} \uplus \mathsf{nat}$
is the sorting with two copies of $\omega$ labelled by ")" and "]".
(iii) We define nat⇒nat as a type with three sorts, one is a singleton written $]^{\mathsf{nat}\Rightarrow\mathsf{nat}}$,
another a copy of $\omega$ written $[^{\mathsf{nat}}$ and the third again a copy of $\omega$ written $]^{\mathsf{nat}}$,
for which labels are given as these notations indicate. The justification is given so
that $]^{\mathsf{nat}\Rightarrow\mathsf{nat}}$ only justifies $[^{\mathsf{nat}}$, which in turn only justifies $]^{\mathsf{nat}}$.
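
The conditions of 2.1 and the examples of 2.2 can be checked mechanically. The following is only a minimal sketch under stated assumptions: sorts are represented by hypothetical string names (their elements, the actions, are omitted), and the class `Sorting` with its method names is introduced here for illustration and is not notation from the paper.

```python
from dataclasses import dataclass, field

# Labels: "(" Player Question, "[" Opponent Question,
#         "]" Player Answer,   ")" Opponent Answer.
# ALLOWED[l] is the set of labels a sort labelled l may justify (2.1 (i)).
ALLOWED = {"[": {"(", "]"}, "(": {"[", ")"}, "]": {"["}, ")": {"("}}
DUAL = {"[": "(", "(": "[", "]": ")", ")": "]"}


@dataclass
class Sorting:
    label: dict                              # sort name -> label
    obs: dict = field(default_factory=dict)  # sort name -> set of justified sorts

    def justified_by(self, s):
        return self.obs.get(s, set())

    def is_well_formed(self):
        # Every justification edge must respect the label constraints.
        return all(self.label[t] in ALLOWED[self.label[s]]
                   for s in self.label for t in self.justified_by(s))

    def init(self):
        # Initial sorts are those justified by no sort.
        justified = {t for s in self.label for t in self.justified_by(s)}
        return {s for s in self.label if s not in justified}

    def is_cbv_type(self):
        # 2.1 (ii): initial sorts labelled "]" and every sort reachable.
        if not self.is_well_formed():
            return False
        roots = self.init()
        if any(self.label[s] != "]" for s in roots):
            return False
        seen, stack = set(roots), list(roots)
        while stack:
            for t in self.justified_by(stack.pop()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return seen == set(self.label)

    def dual(self):
        # Dualise every label, keep the justification relation.
        return Sorting({s: DUAL[l] for s, l in self.label.items()}, dict(self.obs))


zero = Sorting({})
nat = Sorting({"nat": "]"})
nat_to_nat = Sorting({"fun": "]", "arg": "[", "res": "]"},
                     {"fun": {"arg"}, "arg": {"res"}})
```

Under this encoding `zero`, `nat` and `nat_to_nat` pass `is_cbv_type()`, while `nat.dual()` is a well-formed sorting but not a type, since its initial sort is labelled ")".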

By a sequence from a set X we mean a partial function from $\omega$ to X defined for a finite
initial segment of $\omega$ (called indices) and undefined for the rest. As an example, abc has
{0, 1, 2} as its indices. We often confuse elements and their occurrences in a sequence.
$\varepsilon$ denotes the empty sequence. We are interested in sequences of actions representing a
certain  kind  of  interaction  between  an  agent  (Player)  and  the  outside  (Opponent). 

2.3. Action Sequence. Given a sorting $\mathbb{S}$, an action sequence in $\mathbb{S}$, or often simply a
sequence in $\mathbb{S}$, is a sequence from actions in $\mathbb{S}$ (let it be $x_0 x_1 \cdots x_{n-1}$) together with the
relation on its indices denoted $\mapsto$ (writing $x_i \mapsto x_j$ for $i \mapsto j$), satisfying:

(consistency) (1) $x_i \mapsto x_j \Rightarrow i < j$, (2) $(x_i \mapsto x_k \wedge x_j \mapsto x_k) \Rightarrow i = j$, (3)
$x_i^{S} \mapsto x_j^{S'} \Rightarrow S' \in \mathrm{Obs}(S)$, (4) if no $x_i$ satisfies $x_i \mapsto x_j^{S}$ then $S$ is initial (then $x_j$ occurs free),

(linearity in answers) (5) $(x_i \mapsto x_j \wedge x_i \mapsto x_k$, with $x_j$ and $x_k$ answers$) \Rightarrow j = k$, (6) A
free O-answer (resp. a free P-answer) occurs at most once, and:

(strict alternation) (7) If $x_i$ is a P-action (resp. O-action), then $x_{i+1}$ is an O-action
(resp. P-action) for $0 \leq i \leq n - 2$.

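$s, s', \ldots$ range over action sequences, often leaving the associated $\mapsto$ implicit. We say $x_i$
justifies $x_j$ when $x_i \mapsto x_j$. On action sequences we define two functions, $\ulcorner s \urcorner$, the P-view
of $s$, and $\llcorner s \lrcorner$, the O-view of $s$, as, inheriting $\mapsto$ whenever possible: (pv0) $\ulcorner \varepsilon \urcorner = \varepsilon$,
(pv1) $\ulcorner s x_i \urcorner = x_i$ when $x_i$ is a free O-action, (pv2) $\ulcorner s_0 x_i s_1 x_j \urcorner = \ulcorner s_0 \urcorner x_i x_j$ when
$x_i \mapsto x_j$ and $x_j$ is an O-action, and (pv3) $\ulcorner s_0 x_i \urcorner = \ulcorner s_0 \urcorner x_i$ if $x_i$ is a P-action; $\llcorner s \lrcorner$ is
defined dually, i.e. by exchanging "O-action" and "P-action" throughout. We then say: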

(i) $s$ is well-bracketed when: if $s_0 x_i s_1 x_j$ is a prefix of $s$ such that (1) $x_i$ is a question,
(2) $x_j$ is an answer and (3) either $x_j$ occurs free or $x_j$ is justified by a question in
$s_0$, then $x_i$ justifies an answer in $s_1$.

(ii) $s$ satisfies the visibility condition when, in any of its prefixes $s_0 x_i$ where $x_i$ is a
P-action (resp. O-action), the action $y_j$ such that $y_j \mapsto x_i$ (if any) always occurs in $\ulcorner s_0 \urcorner$ (resp. $\llcorner s_0 \lrcorner$).

An  action  sequence  is  legal  when  it  is  well-bracketed  and  satisfies  the  visibility  condition. 
Legal  action  sequences  are  sometimes  called  legal  positions.  We  can  verify  the  set  of  legal 
sequences  of  any  sorting  is  closed  under  prefix  and  view  constructions. 
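
The clauses (pv0) to (pv3) amount to a backward scan of the sequence. The following is a minimal sketch, assuming a hypothetical encoding in which each action is a triple `(name, owner, justifier)`, with `owner` either `"P"` or `"O"` and `justifier` the index of the justifying action or `None` for a free action; the example sequence is made up for illustration and is not taken from the paper.

```python
def view(seq, kept):
    """Indices of the view of seq; kept="P" gives the P-view, kept="O" the O-view."""
    out = []
    i = len(seq) - 1
    while i >= 0:
        _, owner, justifier = seq[i]
        out.append(i)
        if owner == kept:
            # (pv3): an action of the viewing party is kept, continue with the prefix.
            i -= 1
        elif justifier is None:
            # (pv1): a free action of the other party starts the view.
            break
        else:
            # (pv2): jump to the justifier, dropping everything in between.
            out.append(justifier)
            i = justifier - 1
    return out[::-1]


def p_view(seq):
    return view(seq, "P")


def o_view(seq):
    return view(seq, "O")


# Hypothetical alternating sequence: O starts, P asks, O asks back,
# P answers that, and finally O answers P's first question.
example = [
    ("a", "O", None),
    ("q", "P", 0),
    ("r", "O", 1),
    ("b", "P", 2),
    ("c", "O", 1),
]
```

Here `p_view(example)` skips the inner exchange at indices 2 and 3, because the last O-action points back to index 1, while `o_view(example)` keeps every index.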

We  are  now  ready  to  give  the  main  definition  of  this  section,  which  determines  the 
class  of  interacting  processes  we  are  concerned  with  in  the  present  study. 

2.4. Definition. (strategy) An innocent strategy from A to B, or simply a strategy
from A to B, is a prefix-closed set $\sigma$ of legal positions in $\overline{A} \uplus B$, such that:

(O-initial) $s \in \sigma$ implies the initial action of $s$ (if any) is an O-action.

(contingency completeness) $s \in \sigma$ and $s x_i$ is legal for an O-action $x_i$ imply $s x_i \in \sigma$.
(innocence) If $s_1 x, s_2 \in \sigma$, $x$ is a P-action and $\ulcorner s_1 \urcorner = \ulcorner s_2 \urcorner$, then $s_2 y \in \sigma$ such that (1)
$\ulcorner s_1 x \urcorner = \ulcorner s_2 y \urcorner$ and (2) $s_2 z \in \sigma \Rightarrow s_2 z = s_2 y$.
We write $\sigma : A \to B$ when $\sigma$ is a strategy from A to B. $f_\sigma$ denotes the partial function
determined by $\sigma$, mapping even-length P-views to next actions (if any) with justification.
Given $\sigma, \tau : A \to B$, we set $\sigma \leq \tau$ when $\sigma \subseteq \tau$, equivalently when $f_\sigma \subseteq f_\tau$.

Using the function representation, it is easy to see the set of strategies from A to B forms
a dI-domain under $\leq$, where compact elements are those with finite graphs. Further,
given $\sigma : A \to B$, if $x_i$ and $x_{i+1}$ come from different types then $x_{i+1}$ is
necessarily a P-action (switching condition). Also the projection of $s \in \sigma : A_1 \to A_2$
onto $A_i$ ($i = 1, 2$), written $s \upharpoonright A_i$, is always legal in $A_i$.

2.5.  Examples,  (strategies) 

(i) (undefined) For each A and B, there is a strategy from A to B which is totally
undefined, so that it is least w.r.t. the ordering $\leq$. We write this strategy $\bot_{A \to B}$.

(ii) (first-order function) The strategies from nat to nat correspond precisely to
the partial functions from $\omega$ to $\omega$.

(iii) (higher-order function) We describe a strategy $\sigma : \mathsf{nat}{\Rightarrow}\mathsf{nat} \to \mathsf{nat}$ which corresponds
to the behaviour of an open call-by-value PCF-term, $x : \iota \Rightarrow \iota \triangleright \mathsf{succ}(x3) : \iota$.
After receiving a signal on the left, which is a function, $\sigma$ asks the result of
applying that function to 3, and, on receiving the answer, returns its successor to
the right. Except for the last free answer, each action is justified by the preceding one.
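
Example (ii) can be made concrete on a finite window of the natural numbers. The sketch below is illustrative only: positions are tuples of hypothetical `(owner, value)` pairs, justification pointers are left out (both actions are free signals here), and `universe` truncates $\omega$ to a finite range, which is an assumption made so that the sets are finite.

```python
def strategy_of(f, universe):
    """Positions of the strategy nat -> nat induced by the partial map f (a dict)."""
    positions = {()}
    for n in universe:
        # Contingency completeness: every initial O-signal is accepted.
        positions.add((("O", n),))
        if n in f:
            # Where f is defined, Player answers at once with f(n).
            positions.add((("O", n), ("P", f[n])))
    return frozenset(positions)


def is_prefix_closed(positions):
    return all(s[:k] in positions for s in positions for k in range(len(s)))


window = range(4)
undefined = strategy_of({}, window)
double_on_evens = strategy_of({0: 0, 2: 4}, window)
double = strategy_of({n: 2 * n for n in window}, window)
```

The ordering of 2.4 is plain inclusion of position sets, so `undefined <= double_on_evens <= double` holds, mirroring the inclusion of the graphs of the underlying partial functions.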

Strategies  denote  a  certain  kind  of  deterministic  processes,  and  are,  as  such,  precisely 
representable  as  (name  passing)  synchronisation  trees,  see  [18].  The  presentation  is 
often  useful  for  describing,  and  reasoning  about,  strategies:  indeed  the  full  abstraction 
result  was  originally  obtained  in  this  setting  [17].  The  following  inductive  definition  of 
composition  of  strategies  is  suggested  by  such  representation. 

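2.6. Definition. (composition) Given $\sigma : A \to B$ and $\tau : B \to C$, we set:

$\sigma;\tau = \{ s_1;s_2 \mid s_1 \in \sigma,\ s_2 \in \tau,\ s_1 \upharpoonright B = s_2 \upharpoonright B \}$

where $s_1;s_2$ with $s_1$ and $s_2$ as above is given: (1) $\varepsilon;\varepsilon = \varepsilon$, (2) $s_1 x^{B}; s_2 \overline{x}^{B} = s_1;s_2$
($\overline{x}^{B}$ is the corresponding dual action of $x^{B}$), and (3) $s_1 x^{A}; s_2 = (s_1;s_2)x^{A}$, $s_1; s_2 x^{C} =
(s_1;s_2)x^{C}$, in each case inheriting the justification relation from the original pair.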
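
The inductive clauses of 2.6 can be read as a recursive merge that hides the shared actions at B. The sketch below is a simplification under stated assumptions: actions are hypothetical `(component, value)` pairs, dual actions at B are identified by their value, justification pointers are ignored, and only first-order strategies on a finite window are exercised.

```python
def proj(s, comp):
    """Values of the actions of s that live in the given component."""
    return tuple(v for c, v in s if c == comp)


def interleave(s1, s2):
    """s1;s2 following clauses (1)-(3), scanning both sequences from the end."""
    if not s1 and not s2:
        return ()                                          # clause (1)
    if s1 and s2 and s1[-1][0] == "B" and s2[-1][0] == "B":
        return interleave(s1[:-1], s2[:-1])                # clause (2): hide B
    if s2 and s2[-1][0] == "C":
        return interleave(s1, s2[:-1]) + (s2[-1],)         # clause (3), C side
    if s1 and s1[-1][0] == "A":
        return interleave(s1[:-1], s2) + (s1[-1],)         # clause (3), A side
    raise ValueError("sequences do not agree on B")


def compose(sigma, tau):
    return {interleave(s1, s2) for s1 in sigma for s2 in tau
            if proj(s1, "B") == proj(s2, "B")}


def first_order(f, universe, src, dst):
    """Positions of the first-order strategy induced by the partial map f."""
    positions = {()}
    for n in universe:
        positions.add(((src, n),))
        if n in f:
            positions.add(((src, n), (dst, f[n])))
    return positions


double = {n: 2 * n for n in range(3)}
succ = {m: m + 1 for m in range(5)}
composite = compose(first_order(double, range(3), "A", "B"),
                    first_order(succ, range(5), "B", "C"))
expected = first_order({n: 2 * n + 1 for n in range(3)}, range(3), "A", "C")
```

In this first-order setting the two sub-cases of clause (3) never compete, so the order in which `interleave` tests them is immaterial; for general strategies the text attributes this to the switching condition.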

(3) above is well-defined since the two cases are always disjoint due to the switching
condition. We can also verify: (i) $\sigma;\tau$ is a strategy from A to C, (ii) ; is associative with
identity given by the copy-cat strategy, i.e. that which exactly copies actions between
A and A, and (iii) ; is bi-continuous with respect to $\leq$. Thus we define:

2.7.  Definition.  CBV  denotes  the  category  of  cbv-types  and  innocent  strategies. 

By the preceding discussions, CBV is enriched over CPO, the category of possibly
bottomless cpos and continuous functions. Each homset has a least element $\bot$ for which
the composition is left strict, that is $\bot;\sigma = \bot$ always.

3. Intensional Universe

Type  structures  of  a  semantic  universe  offer  the  basic  articulation  of  its  algebraic  struc¬ 
tures  needed,  for  example,  for  interpreting  various  programming  languages  in  it.  This 
section clarifies the basic type structure of CBV in the light of the distinction between
total  and  partial  maps.  We  first  introduce  the  notion  of  totality,  cf.  [13]. 

3.1. Definition. $\sigma$ is total when $\tau;\sigma = \bot$ implies $\tau = \bot$. We write $\sigma \Downarrow$ when $\sigma$ is total.

The totality of $\sigma : A \to B$ is equivalent to any one of: (1) $\forall \tau : 1 \to A.\ \tau \Downarrow\ \Rightarrow\ \tau;\sigma \Downarrow$,
(2) the square ($0 \to A \to B$, $0 \to 0 \to B$) is a weak pullback (notice 0 is initial and
weakly terminal), and (3) $\sigma$ immediately emits the P-signal for each initial O-signal. (1)
relates to a familiar idea of totality, (2) is a categorically basic one, and (3) gives the
behavioural characterisation, clarifying the dynamic aspect of totality.
3.2. Examples. (total maps)

(i) The unique arrow $\bot$ from 0 to any type is total, by definition. All isomorphisms
are total. Also, there is no total map to 0, except from itself.

(ii) There is a unique total map from A to 1 for each A. It reacts to the initial signal
(if any) by the unique P-signal at 1, and no more action is possible.

(iii) $\sigma : \mathsf{nat} \to \mathsf{nat}$ is total iff the underlying number-theoretic function is total.

Let us write CBVt for the category of types and total strategies. Since totality is closed
upwards w.r.t. $\leq$, CBVt again CPO-enriches. It has finite products: 3.2 (ii) above shows
1 is terminal, while the product of A and B is given by a type $A \otimes B$ whose sorts are the
disjoint union of non-initial sorts of A and B together with, for each pair of $S \in \mathrm{init}(A)$
and $S' \in \mathrm{init}(B)$, a sort given by $S \times S'$ (the set theoretic product), which justifies what
$S$ and $S'$ justify in A and B, the rest as in A and B ($A \otimes 0$ and $0 \otimes A$ are set as 0).
Projection maps are evidently given. $\otimes$ is often denoted $\times$ in CBVt. We also note that
CBVt has arbitrary (small) products and co-products, but we do not need them here.

The relationship between total maps and usual (often called partial) maps is clarified
by the notion of lifting. Write $A_\bot$ for the type given by adding two singleton sorts to A,
one initial which justifies the other one, the latter justifying all $S \in \mathrm{init}(A)$, the rest as
in A. Then we can see the set of total arrows from 1 to $A_\bot$ is order-isomorphic to the set
of partial arrows from 1 to A. These two are mediated by two copy-cat like strategies,
$\mathrm{up} : A \to A_\bot$ and $\mathrm{dn} : A_\bot \to A$, with obvious behaviours (up reacts to an initial action at
A by going through the two added actions at $A_\bot$ and then does the copy-cat; dn just does the
dual). In a familiar way this induces the adjoint situation as described below.

3.3.  Proposition. 

(i) Let F be the inclusion functor from CBVt to CBV. Then F has the right adjoint T,
with $T(A) = A_\bot$, the unit $\eta_A = \mathrm{up}$, and the co-unit $\epsilon = \mathrm{dn}$, which CPO-enriches.
The monad $(T, \eta_A, T(\mathrm{dn}))$ is denoted T, which has a tensorial strength $\mathrm{st}_{A,B}$
and a co-strength (in the sense of [37]) $\mathrm{st}'_{A,B}$.

(ii) The Kleisli category of T on CBVt is isomorphic to CBV. We write $\sigma^{\dagger}$ for
$\mathrm{up};T(\sigma) : A \to B_\bot$ where $\sigma : A \to B$ is partial, and for $\sigma;\mathrm{dn} : A \to B$ where
$\sigma : A \to B_\bot$ is total.
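
As a purely set-theoretic analogy to (ii), and not the game construction itself, partial maps between sets correspond to total maps into a lifted codomain, and composing partial maps corresponds to Kleisli composition. The sketch below assumes partial maps are given as Python dicts; `BOTTOM`, `lift`, `kleisli` and `compose_partial` are hypothetical names introduced for illustration.

```python
BOTTOM = object()  # hypothetical marker for the extra element of a lifted set


def lift(partial):
    """Turn a partial map (a dict) into a total map into the lifted codomain."""
    return lambda a: partial.get(a, BOTTOM)


def kleisli(g, h):
    """Compose two total maps into lifted sets, propagating BOTTOM."""
    def composite(a):
        b = g(a)
        return BOTTOM if b is BOTTOM else h(b)
    return composite


def compose_partial(p, q):
    """Ordinary composition of partial maps given as dicts."""
    return {a: q[b] for a, b in p.items() if b in q}


half = {0: 0, 2: 1, 4: 2}   # defined on even numbers only
pred = {1: 0, 2: 1}         # undefined at 0
via_kleisli = kleisli(lift(half), lift(pred))
via_partial = lift(compose_partial(half, pred))
```

Both routes agree on every argument, which is the set-level shadow of the isomorphism between the Kleisli category and the category of partial maps.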

Using the monad T, we can now present the basic type structures of CBV. In (iii) below
$A \Rightarrow B$ is a type whose sorts are the disjoint union of those of A and B together with
a new initial sort which is a singleton, with the label of each $S \in \mathrm{init}(A)$ changed into [ and
those of A's non-initial sorts dualised. Justification is as in A and B, with the addition of
the new sort justifying what were in $\mathrm{init}(A)$, each of which in turn justifying what were in
$\mathrm{init}(B)$ ($0 \Rightarrow B$ is set as 1). Notice the similarity with the construction of $A \otimes B$.

3.4.  Definition  and  Proposition. 

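(i) (partial pairing [32]) Given $\sigma_1 : C \to A$ and $\sigma_2 : C \to B$, their left pairing,
$\langle\langle\sigma_1, \sigma_2\rangle\rangle_l : C \to A \otimes B$, and the right pairing, $\langle\langle\sigma_1, \sigma_2\rangle\rangle_r : C \to A \otimes B$, are given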

as:  <T2))i  ='  ((cri,  <^2)}r  =  ((o’},  ^2);  t.',4,s)t  where 

=  st'^j^s;T{stA,B',^^)  and  tpA,B  =  ^^A,TB',T{st'^^^;6n). 

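(ii) (premonoidal tensor [37]) Given A, we define $A \otimes$ and $\otimes A$ by: (i) $A \otimes \cdot\, B = A \otimes B$
and $B \cdot \otimes A = B \otimes A$, and (ii) $A \otimes \sigma = \langle\langle \pi_1, \pi_2;\sigma \rangle\rangle$, and $\sigma \otimes A = \langle\langle \pi_1;\sigma, \pi_2 \rangle\rangle$
where $\pi_i$ denote projections. Then $A \otimes$ and $\otimes A$ both define functors on CBV
which CPO-enrich. We then define, for $\sigma : A \to B$ and $\tau : C \to D$: (i) $\sigma \otimes_l \tau =
(\sigma \otimes C); (B \otimes \tau)$ and (ii) $\sigma \otimes_r \tau = (A \otimes \tau); (\sigma \otimes D)$.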

(iii) (partial exponential [23]) The functor $\cdot \otimes A$ : CBVt $\to$ CBV has the right adjoint
$A \Rightarrow \cdot$ : CBV $\to$ CBVt, which CPO-enriches. Equivalently, there exists an arrow
$\mathrm{ev} : (A \Rightarrow B) \otimes A \to B$ such that, for any $\sigma : C \otimes A \to B$, there is a unique total
arrow $p\lambda(\sigma) : C \to A \Rightarrow B$ satisfying $(p\lambda(\sigma) \otimes \mathrm{id}); \mathrm{ev} = \sigma$, and $p\lambda$ is a continuous operator.

An outstanding fact on partial pairing is that the right and left pairings of the same
tuple do not coincide in general. This exhibits a strongly intensional character of CBV,
substantiating Moggi's remark that $\langle\langle\sigma_1, \sigma_2\rangle\rangle_l$ and $\langle\langle\sigma_1, \sigma_2\rangle\rangle_r$ reflect the "order of
evaluation" [32]. This also implies the tensor in CBV does not give a bifunctor, cf. Corollary
4.3 of [37]. We write $\langle\langle\sigma_1, \sigma_2\rangle\rangle$ when the two versions coincide (as when either is total).
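
The difference between the two pairings can be illustrated by an analogy with interaction traces. The sketch below is not the categorical definition: computations are modelled as hypothetical Python functions taking an environment and a trace list, and `query` records which variable is consulted.

```python
def query(name):
    """A computation that consults one variable and records the interaction."""
    def run(env, trace):
        trace.append(name)
        return env[name]
    return run


def left_pairing(c1, c2):
    """Run the first component, then the second."""
    def run(env, trace):
        a = c1(env, trace)
        b = c2(env, trace)
        return (a, b)
    return run


def right_pairing(c1, c2):
    """Run the second component, then the first."""
    def run(env, trace):
        b = c2(env, trace)
        a = c1(env, trace)
        return (a, b)
    return run


env = {"x": 1, "y": 2}
left_trace, right_trace = [], []
left_value = left_pairing(query("x"), query("y"))(env, left_trace)
right_value = right_pairing(query("x"), query("y"))(env, right_trace)
```

Both pairings return the same pair, yet `left_trace` and `right_trace` list the interactions in opposite orders; an observer of the interaction, as in the intensional universe, can tell them apart, while an observer of results alone cannot.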

The final structure we need is recursion, here presented as an operator on each homset.
[18] gives an alternative presentation as constants. Below we say A is pointed when it
has a unique initial sort which is a singleton, equivalently when hom(1, A) in CBVt is
a pointed cpo. For such A, $\mathrm{dn}'_A : TA \to A$ denotes the unique total map such that
$\mathrm{up}_A;\mathrm{dn}'_A = \mathrm{id}_A$. Pointed types are precisely objects in the category of Eilenberg-Moore
algebras of the monad T. Also, any type of the form $A \Rightarrow B$ is pointed.

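3.5. Proposition. Let A be pointed and $\sigma : C \times A \to A$. Then there is a strategy
$\mathrm{rec}(\sigma) : C \to A$ which satisfies: (i) $\tau;\mathrm{rec}(\sigma) = \tau;\langle\langle \mathrm{id}_C, \mathrm{rec}(\sigma)\rangle\rangle;\sigma$ for $\tau : 1 \to C$ (if $\sigma$
is total we can take off $\tau$ from the equation), (ii) $\mathrm{rec}(\tau \otimes \mathrm{id}_A; \sigma) = \tau;\mathrm{rec}(\sigma)$ for each
$\tau : B \to C$, and: (iii) Given $\tau : 1 \to C$, if $\{\rho_i : 1 \to A\}$ is defined as: (1) $\rho_0 = \bot$, (2)
$\rho_{i+1} = \langle\langle \tau, \rho_i^{\dagger};\mathrm{dn}'\rangle\rangle;\sigma$, then $\{\rho_i\}$ is an increasing $\omega$-chain such that $\bigsqcup \rho_i = \tau;\mathrm{rec}(\sigma)$.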
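
Clause (iii) follows the usual pattern of a least fixed point reached as the limit of finite approximants. As an analogy on partial functions rather than strategies, the sketch below iterates a functional for the factorial function; the functional `step` and the dict encoding are assumptions chosen for illustration.

```python
def step(approx):
    """One unfolding of the factorial functional on a finite partial map."""
    nxt = {0: 1}
    for n, value in approx.items():
        nxt[n + 1] = (n + 1) * value
    return nxt


def chain(length):
    """The approximants rho_0, ..., rho_length, starting from the empty map."""
    rho = {}
    out = [rho]
    for _ in range(length):
        rho = step(rho)
        out.append(rho)
    return out


def is_increasing(maps):
    """Each approximant extends the previous one as a graph."""
    return all(a.items() <= b.items() for a, b in zip(maps, maps[1:]))


approximants = chain(5)
```

Each approximant is defined on one more argument than its predecessor, so the chain is increasing and its union is the factorial function on all of the natural numbers.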

4.  Extensional  Universe 

CBV  represents  an  abstract  notion  of  execution  of  call-by-value  computation.  For  the 
interpretation  of  programming  languages  at  the  same  abstraction  level  as  in  the  standard 
semantic  universe  like  the  category  of  domains,  we  may  need  a  more  abstract  universe, 
which  we  construct  from  CBV  by  a  simple  quotient  construction.  The  universe  is  also 
useful  for  understanding  the  behaviour  of  arrows  in  CBV  in  an  abstract  way.  Below  we 
briefly  outline  the  basic  structure  of  this  universe,  leaving  details  to  [18].  We  start  from 
the  following  ordering  (cf.  [36,  11]): 

cti;^cT2  «  VC,  C',  r;C'-)-A  r':B-+C'. 

Immediately $\precsim$ is a preorder for which the composition is monotone (thus the quotient is
well-defined), and $\leq\ \subseteq\ \precsim$. We now define the extensional quotient of CBV as the category
of types and $\precsim$-equivalence classes of strategies. $f, g, \ldots$ range over arrows in the quotient.
The induced partial order is still denoted $\precsim$. The quotient is enriched over Poset, the category
of posets with monotone maps, since monotonicity carries over from CBV. Observing 0 is the
zero object in the quotient (i.e. both terminal and initial), we define $\bot : A \to B$ as the unique
map that factors through 0, cf. [13]. Then $\bot$ is indeed the least element in each homset, and
the composition is strict at both sides. We can then define total maps as before: $f \Downarrow$ when
$g; f = \bot$ implies $g = \bot$ for each $g$, equivalently when the square ($0 \to A \to B$, $0 \to 0 \to B$)
is a pullback, from which all properties of total maps as in CBV follow. Notice also
$f \Downarrow\ \Leftrightarrow\ \forall \sigma \in f.\ \sigma \Downarrow\ \Leftrightarrow\ \exists \sigma \in f.\ \sigma \Downarrow$. The total maps form a subcategory of the
quotient, the extensional counterpart of CBVt.

We can then show that the subcategory of total maps of the extensional quotient is well-pointed,
with finite products (indeed all small products and co-products) inducing Poset-enriched
bi-functors, all inherited from CBVt. Again as in CBV, the inclusion functor from the total
subcategory to the quotient has the right adjoint inheriting constructions from T, which we
write again T, which Poset-enriches. The corresponding monad, again denoted T, has strengths
and is now commutative, i.e. the two mediating maps of 3.4 (i) coincide. Again the Kleisli
category of T on the total subcategory is isomorphic to the quotient. Using the monad,
we can now clarify the basic type structure of the quotient. Thus, again from the general result
by Power and Robinson [37], we know the quotient is a Poset-enriched symmetric monoidal
category, which has all type structures as given in Proposition 3.4 (i)(ii)(iii), inheriting
the constructions from CBV, where left and right pairings are identified. Finally the
recursion in CBV carries over to the quotient, though all $\tau : 1 \to C$ in 3.5 can be replaced with
$\mathrm{id}_C$. We also note that the quotient allows the treatment of recursive types for a large class of
functors, but we do not use them in the present paper.

5.  Interpretation  of  PCFv 

PCFv [35, 36] is a typed programming language based on call-by-value evaluation. The
syntax and evaluation rules can be found in the standard literature, cf. [15, 42, 40], and
are briefly reviewed in the Appendix (following [15], except that recursion is only defined for
function types, cf. [42, 40]). CBV and its extensional quotient are conceived to represent
call-by-value, or partial, higher-order functional computation. Moreover CBV has a type
structure which does include that of PCFv. Thus we may seek to represent PCFv-terms
and their computation in these universes. We primarily consider the interpretation in CBV,
and only move to its extensional quotient at the last step. The interpretation follows.

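5.1. Definition. First we define the mapping from the set of types and environments of
PCFv to objects in CBV as: $[\![\iota]\!] = \mathsf{nat}$, $[\![o]\!] = \mathsf{bool}$, $[\![\alpha \Rightarrow \beta]\!] = [\![\alpha]\!] \Rightarrow [\![\beta]\!]$, $[\![\emptyset]\!] = 1$ and
$[\![\Gamma, x : \alpha]\!] = [\![\Gamma]\!] \otimes [\![\alpha]\!]$. Then the mapping from PCFv-terms to arrows in CBV is given
inductively as follows, assuming either of the left/right pairings is selected uniformly.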

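(i) $[\![\Gamma, x : \alpha, \Delta \triangleright x : \alpha]\!] = \pi : [\![\Gamma]\!] \otimes [\![\alpha]\!] \otimes [\![\Delta]\!] \to [\![\alpha]\!]$, where $\pi$ is an appropriate projection.

(ii) $[\![\Gamma \triangleright \lambda x^{\alpha}.M : \alpha \Rightarrow \beta]\!] = p\lambda(\sigma) : [\![\Gamma]\!] \to [\![\alpha \Rightarrow \beta]\!]$, where $[\![\Gamma, x : \alpha \triangleright M : \beta]\!] = \sigma$.

(iii) $[\![\Gamma \triangleright MN : \beta]\!] = \langle\langle\sigma_1, \sigma_2\rangle\rangle; \mathrm{ev} : [\![\Gamma]\!] \to [\![\beta]\!]$, where $[\![\Gamma \triangleright M : \alpha \Rightarrow \beta]\!] = \sigma_1$ and
$[\![\Gamma \triangleright N : \alpha]\!] = \sigma_2$.

(iv) $[\![\Gamma \triangleright \mu x^{\alpha}.M : \alpha]\!] = \mathrm{rec}(\sigma) : [\![\Gamma]\!] \to [\![\alpha]\!]$, where $[\![\Gamma, x : \alpha \triangleright M : \alpha]\!] = \sigma$.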

(v)  [Focond  L  Ml  M2  :  a]  {<(r,  ((cr},  o-J})});  7T([a])  )|  =  FI -5^  [a]  where  [F  >  X  : 
o|  —  r,  |F  >  Ml  :  a|  =  <7i,  fF  t>  M2  :  aj  =  (72  and  7^  :  bool  0  A  0  A  — )■  A  is  a 
strategy  with  an  appropriate  behaviour. 

(vi)  For  a  constant  c  of  type  a,  we  set:  |F  >  c  :  a]  -[r];  c  :  [F]  ^  1  -)•  [a]  where 

3 ;  |q]  is  given  as  a  strategy  with  obvious  behaviour  for  each  c. 

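The descriptions of $\gamma$ and $\hat{c}$ for each $c$ are given in [18]. As basic properties of the
mapping, we know $[\![\Gamma \triangleright V : \alpha]\!]$ is always total, where V denotes a value, i.e. an abstraction
or a non-$\Omega$ constant; $[\![\Gamma \triangleright M\{V/x\} : \beta]\!] = \langle\langle \mathrm{id}_{[\![\Gamma]\!]}, \tau\rangle\rangle; \sigma : [\![\Gamma]\!] \to [\![\beta]\!]$ for any $\tau = [\![\Gamma \triangleright V :
\alpha]\!]$ and $\sigma = [\![\Gamma, x : \alpha \triangleright M : \beta]\!]$; and that $\Gamma \triangleright M \Downarrow V$ implies $[\![M]\!] = [\![V]\!]$. We can then
verify the following key properties of the interpretation.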

5.2.  Proposition. 

(i) (computational adequacy) $[\![M]\!] \neq \bot$ iff $\exists V.\ M \Downarrow V$ for a closed M.

(ii) (adequacy) $[\![M]\!] \precsim [\![N]\!]$ implies $M \precsim_{obs} N$ for closed M, N of the same type.
Given the adequacy result, if we show its converse, i.e. $\precsim_{obs}$ implies $\precsim$ via the
interpretation, then we obtain the full abstraction. For the purpose it suffices to prove all
compact elements of appropriate types are PCFv-definable, cf. [25, 35]. The definability
argument is carried out using a subset of PCFv-terms defined as follows.

5.3.  Definition.  Finite  canonical  forms  (FCFs  for  short)  are  inductively  given  as: 

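(i) $\Gamma \triangleright \Omega : \alpha$ and $\Gamma \triangleright n : \iota$ are FCFs.

(ii) $\Gamma \triangleright \lambda y^{\alpha}.M : \alpha \Rightarrow \beta$ is a FCF if $\Gamma, y : \alpha \triangleright M : \beta$ is.

(iii) $\Gamma \triangleright \mathsf{let}\ y^{\alpha} = zV\ \mathsf{in}\ N : \beta$ is a FCF if (1) $\Gamma, y : \alpha \triangleright N : \beta$ is a FCF, (2) $z$ has a
type $\beta \Rightarrow \alpha$ in $\Gamma$, and (3) $\Gamma \triangleright V : \beta$ is a FCF (which is also a value).

(iv) $\Gamma \triangleright (\mathsf{case}\ x\ \mathsf{of}\ n_1 : M_1\ []\ n_2 : M_2\ []\ ..\ []\ n_k : M_k) : \alpha$ is a FCF if $x : \iota \in \Gamma$ and, for
each $i$, $\Gamma \triangleright M_i : \alpha$ is a FCF.

where, in (iii), $\mathsf{let}\ y^{\alpha} = zM\ \mathsf{in}\ N$ stands for $(\lambda y^{\alpha}.N)(zM)$, and, in (iv), $\mathsf{case}\ y\ \mathsf{of}\ n_1 :
M_1\ []\ ..\ []\ n_k : M_k$ stands for $\mathsf{cond}\ (y = n_1)\ M_1\ (\ldots(\mathsf{cond}\ (y = n_k)\ M_k\ \Omega)..)$, the latter
assuming the equality check is suitably encoded in PCFv.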
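
The two abbreviations can be expanded mechanically. The sketch below assumes a hypothetical tuple encoding of terms (`"lam"`, `"app"`, `"var"`, `"num"`, `"cond"`, `"omega"`), and an `"eq"` node standing in for the equality check that the text assumes to be encoded in PCFv; none of these constructor names come from the paper.

```python
def let(y, alpha, z, arg, body):
    """let y^alpha = z arg in body, expanded to (lambda y^alpha. body)(z arg)."""
    return ("app", ("lam", y, alpha, body), ("app", ("var", z), arg))


def case(y, branches):
    """case y of n1 : M1 [] .. [] nk : Mk, expanded to nested conditionals ending in Omega."""
    term = ("omega",)
    # Build from the innermost conditional outwards.
    for n_i, m_i in reversed(branches):
        term = ("cond", ("eq", ("var", y), ("num", n_i)), m_i, term)
    return term


example_case = case("y", [(0, ("num", 1)), (1, ("num", 2))])
example_let = let("y", "iota", "z", ("num", 3), ("var", "y"))
```

For instance `example_case` tests `y` against 0 first, then against 1, and falls through to the undefined term when neither matches, which is why a finite case statement denotes a compact (finitely defined) behaviour.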

FCFs  faithfully  capture  the  behaviour  of  compact  strategies  of  PCF-types: 

(i) $\Omega$ denotes $\bot$. $m : \iota$ immediately returns $m$ after an initial O-signal.

(ii) $\lambda x^{\alpha}.M : \alpha \Rightarrow \beta$ represents a strategy which, after an initial O-signal, does a
sequence of actions $]^{\alpha \Rightarrow \beta}\ [^{\alpha}$ (here an annotated label denotes an action of that
kind) where $]^{\alpha \Rightarrow \beta} \mapsto [^{\alpha}$, then behaves as M.

(iii) $\Gamma, x_i : \gamma_1 \Rightarrow \gamma_2, \Delta \triangleright \mathsf{let}\ y^{\gamma_2} = x_i M\ \mathsf{in}\ N : \beta$ first interacts at $x_i$ by $(^{\gamma_1}$, then
Opponent may ask at M (when $\gamma_1$ is a higher-order type) which, after some interactions,
will be answered by Player, followed by an Opponent Answer $)^{\gamma_2}$. Then the
actions move to N. Here the "let" construct is used to make the order of evaluation
explicit (see [32] for a similar use of the construct in a different context).

(iv)  The  case  statement  corresponds  to  the  situation  when  a  strategy  acts  according 
to  the  received  ground  values  (here  natural  numbers).  A  vector  of  values  can  be 
handled  by  nesting  the  construct. 

Using  FCFs  we  can  prove: 

5.4. Theorem. (definability) For each compact element $\sigma : 1 \to [\![\alpha]\!]$ for any PCFv-type
$\alpha$ in CBV, there is a FCF $F : \alpha$ such that $[\![F : \alpha]\!] = \sigma$. Conversely, the interpretation
of any FCF is a compact element in the respective type.

The proof is by induction on the cardinality of compact elements, translating the
behaviour of strategies into the corresponding FCFs based on the correspondence between
actions and strategies we illustrated above. We note that, like FCFs themselves, the
argument is much simpler than the corresponding one in call-by-name PCF, cf. [19]. See
[18] for details. Write $[\![\Gamma \triangleright M : \alpha]\!]_{\mathcal{E}}$ for the $\precsim$-equivalence class of $[\![\Gamma \triangleright M : \alpha]\!]$. From the
definability result we can now conclude:

5.5. Theorem. (full abstraction) For closed PCFv-terms $M : \alpha$ and $N : \alpha$, we have

$M : \alpha \precsim_{obs} N : \alpha$ iff $[\![M : \alpha]\!]_{\mathcal{E}} \precsim [\![N : \alpha]\!]_{\mathcal{E}}$.


6. Discussions

6.1. Further Results. First we briefly outline how the call-by-name universe and the
call-by-value universe are mutually embeddable, as in the context of domains. Let cbn-types
be sortings in which (1) initial sorts are all Opponent Questions and (2) each sort is
reachable from some initial sort. The strategies are then as in Definition 2.4 with an
added condition which ensures the switching condition. The composition of strategies is
just as in Section 2, based on which we obtain the category of cbn-types and innocent
strategies which is cartesian-closed and is enriched over CPO, which we denote CBN.
There is a full embedding of CA of [19] in CBN and its extensional quotient allows
interpretation of call-by-name FPC as in the category in [24]. Now we say a CBN type
is pointed when it has a unique initial sort which is a singleton, just as in CBV. Let us
also say a strategy in CBN is linear when, after the initial question at the codomain, it
immediately asks the question at the domain, and never asks an initial question at the
domain again. Writing $\mathrm{CBN}_l$ for the subcategory of CBN of pointed types and linear
strategies, the embedding result says (i) CBN is isomorphic to the full subcategory of
CBVt of pointed types, and (ii) CBV is isomorphic to the full subcategory of $\mathrm{CBN}_l$ of
pointed types whose initial questions justify no questions. The proof is by the translation
of information flow. See [18] for details.

Next  we  discuss  how  we  would  extend  the  full  abstraction  result  in  Section  5  to 
other  call-by-value  programming  languages.  Firstly  it  is  straightforward  to  extend  the 
argument in Section 5 to PCFv with sums and products or to the untyped call-by-value
λ-calculus. Recursively typed languages such as FPC [15] can also be handled (though
the  premonoidal  tensor  in  CBV  poses  a  problem),  as  observed  by  Fiore  and  as  will  be 
reported  elsewhere.  For  the  interpretation  of  imperative  constructs,  we  would  consider, 
as noted in the Introduction, variants of the present universe by changing parameters of
games  following  [4,  21],  which  does  lead  to  coherent  semantic  universes.  One  interesting 
topic  in  this  context  would  be  whether  one  needs  refined  type  structures  as  in  [4]  for 
the  interpretation  of  the  impure  constructs:  indeed  a  much  simpler,  and  more  direct, 
approach  seems  possible  in  the  present  setting.  Some  results  on  these  topics  will  be 
reported  elsewhere. 

6.2.  Related  works.  After  completing  the  full  version  of  this  paper  [18],  the  authors 
were informed of an independent (and essentially concurrent) work by Riecke and Sandholm
[38] in which they obtained a full abstraction for call-by-value FPC (which easily
implies that of PCFv). The construction is based on Kripke logical relations on pCPO,
and  is  thus  quite  different  from  the  present  one.  No  quotienting  is  necessary  to  reach 
the  semantic  universe,  while  the  construction  of  the  universe  itself  is  substantially  more 
complicated.  In  a  brief  comparison,  one  may  say  that  their  approach  would  give  better 
insights  for  understanding  why  some  (continuous)  function  is  not  sequential;  while  their 
construction  does  not  directly  model  the  dynamic  aspects  of  sequential  call-by-value 
computation,  thus  may  not  lead  to  the  insights  in  that  context.  Thus  two  methods 
would  play  different  roles  in  semantic  analysis. 

In game semantics, Abramsky and McCusker are working on game semantics for call-by-value
languages, based on McCusker's early idea and also suggested by the present
work, which tries to extract call-by-value strategies from the universes of call-by-name
games in [24, 4] (personal communication).$^1$ In another vein, Harmer and Malacaria
are working on game semantics for call-by-value computation based on games originally
introduced in [3]; [16] gives a preliminary study in this direction.

6.3. Intensionality and relationship with process theories. The strongly intensional
character of CBV is not at the same level of abstraction as, say, pCPO. The same
can be said about its call-by-name counterpart and other categories of games, in the sense
that they reflect some notion of execution, albeit abstractly, cf. [9, 19]. From the viewpoint
that the primary purpose of semantic representation of programming languages
lies in giving (in)equations over programs as general as possible, this feature may be
considered a drawback. However, we can take a different perspective, and ask whether
this novel way of representing programs can be put to significant use, especially once
given the full abstraction result as the semantic justification of the representation. As
a first such step, one may exploit the representation for the development of an abstract
theory of execution, including formal optimisation techniques. Type structures as
we studied in Section 4 may be put to effective use in this context. One point of interest in
this regard is that our interpretation of PCF$_v$ in CBV already gives a concise abstract
implementation of the language in the form of name passing processes. The representation
is comparable to Milner's direct encoding in [27], performing the $\beta_v$-reduction by three
name passing interactions. Such a "physical" character of the abstract universe suggests
we may study the execution of, say, call-by-value programming languages from a new
level of mathematical abstraction (this is in line with Girard's studies on the semantics
of cut elimination [14]). Relatedly, the induced encodings also suggest the possibility of
relating game semantics and process theories at the fundamental level. The study of
behavioural types by Milner [28] may suggest possible directions (from which the present
study actually started).

$^1$ At the final stage of preparation of this camera-ready version, we obtained their typescript [5],
which exploits the type structures of the original universe in [4] to interpret a functional language
with a certain imperative feature. Detailed discussions, especially the comparison with an approach we
mentioned in 6.1, should be left for a future occasion.

Appendix: PCF$_v$


We give a brief review of the syntax and operational semantics of the call-by-value PCF
[15, 42, 40]; our treatment is nearest to [15]. Given an infinite set of variables, ranged
over by $x, y, z, \ldots$, the syntax of the language is given as follows.

\[
\alpha ::= \iota \mid o \mid \alpha \Rightarrow \beta
\qquad
M ::= x \mid \lambda x^{\alpha}.M \mid MM \mid \mathsf{cond}\ L\ M_1\ M_2 \mid \mu x^{\alpha \Rightarrow \beta}.M \mid c
\]

where $c$ is a constant. An environment is a list of pairs of a variable and a type, where
all variables are distinct, ranged over by $\Gamma, \Delta, \ldots$ The typing rules of PCF$_v$ are given as:

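\[
\frac{}{\Gamma, x : \alpha, \Delta \triangleright x : \alpha}
\qquad
\frac{c \text{ is a constant of type } \alpha}{\Gamma \triangleright c : \alpha}
\qquad
\frac{\Gamma, x : \alpha \triangleright M : \beta}{\Gamma \triangleright \lambda x^{\alpha}.M : \alpha \Rightarrow \beta}
\]
\[
\frac{\Gamma \triangleright M : \alpha \Rightarrow \beta \quad \Gamma \triangleright N : \alpha}{\Gamma \triangleright MN : \beta}
\qquad
\frac{\Gamma \triangleright L : o \quad \Gamma \triangleright M : \alpha \quad \Gamma \triangleright N : \alpha}{\Gamma \triangleright \mathsf{cond}\ L\ M\ N : \alpha}
\qquad
\frac{\Gamma, x : \alpha \Rightarrow \beta \triangleright M : \alpha \Rightarrow \beta}{\Gamma \triangleright \mu x.M : \alpha \Rightarrow \beta}
\]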


As a set of constants, we assume: $n : \iota$ for each numeral $n$, $\Omega : \alpha$ for each $\alpha$, $\mathsf{succ} : \iota \Rightarrow \iota$,
and $\mathsf{zero?} : \iota \Rightarrow o$. Terms of the form $\triangleright M : \alpha$ (often written $M : \alpha$) are called closed terms.
Abstractions and constants except $\Omega$ are called values.

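On the set of terms we define an evaluation relation $\Downarrow$ in the style of natural semantics.

\[
\frac{}{V \Downarrow V}
\qquad
\frac{M \Downarrow \lambda x.M_0 \quad N \Downarrow V \quad M_0\{V/x\} \Downarrow U}{MN \Downarrow U}
\qquad
\frac{M\{\mu x.M/x\} \Downarrow V}{\mu x.M \Downarrow V}
\qquad
\frac{M \Downarrow n}{\mathsf{succ}\ M \Downarrow n+1}
\]
\[
\frac{M \Downarrow 0}{\mathsf{zero?}\ M \Downarrow \mathsf{true}}
\qquad
\frac{M \Downarrow n+1}{\mathsf{zero?}\ M \Downarrow \mathsf{false}}
\qquad
\frac{L \Downarrow \mathsf{true} \quad M_1 \Downarrow V}{\mathsf{cond}\ L\ M_1\ M_2 \Downarrow V}
\qquad
\frac{L \Downarrow \mathsf{false} \quad M_2 \Downarrow U}{\mathsf{cond}\ L\ M_1\ M_2 \Downarrow U}
\]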
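The evaluation rules above can be read as a recursive interpreter. The following Python sketch is only an illustration under stated assumptions: the tuple encoding of terms and the names `subst` and `evaluate` are hypothetical, types are not checked, the divergent constant $\Omega$ is omitted, and substitution is the naive one, which suffices because only closed terms are ever substituted when a closed program is evaluated.

```python
# Terms are nested tuples (an encoding assumed for this sketch):
#   ("var", x)  ("lam", x, M)  ("app", M, N)  ("fix", x, M)
#   ("cond", L, M1, M2)  ("const", c)  with c an int, a bool, "succ" or "zero?"

def subst(term, name, repl):
    """Replace free occurrences of `name` in `term` by the closed term `repl`."""
    tag = term[0]
    if tag == "var":
        return repl if term[1] == name else term
    if tag == "const":
        return term
    if tag in ("lam", "fix"):
        # An inner binder of the same name shadows `name`.
        if term[1] == name:
            return term
        return (tag, term[1], subst(term[2], name, repl))
    # "app" and "cond": substitute in every immediate subterm.
    return (tag,) + tuple(subst(t, name, repl) for t in term[1:])

def evaluate(term):
    """Big-step call-by-value evaluation of a closed, well-typed term."""
    tag = term[0]
    if tag in ("lam", "const"):
        return term                                   # a value evaluates to itself
    if tag == "fix":
        return evaluate(subst(term[2], term[1], term))  # unfold the recursion once
    if tag == "cond":
        test = evaluate(term[1])
        return evaluate(term[2] if test[1] is True else term[3])
    if tag == "app":
        fun = evaluate(term[1])
        arg = evaluate(term[2])                       # the argument is evaluated first
        if fun == ("const", "succ"):
            return ("const", arg[1] + 1)
        if fun == ("const", "zero?"):
            return ("const", arg[1] == 0)
        return evaluate(subst(fun[2], fun[1], arg))
    raise ValueError("free variable: the term is not closed")

# mu f. lambda n. cond (zero? n) 7 (f 0), applied to the numeral 3
loop_to_seven = ("fix", "f", ("lam", "n", ("cond",
    ("app", ("const", "zero?"), ("var", "n")),
    ("const", 7),
    ("app", ("var", "f"), ("const", 0)))))
result = evaluate(("app", loop_to_seven, ("const", 3)))
```

Evaluating the argument before substituting it into the function body is what makes the sketch call-by-value; a call-by-name variant would substitute the unevaluated argument instead.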
Finally an observational preorder on closed terms is defined as follows: $M \lesssim_{obs} N$ iff,
for any well-typed context of a program type $C[\cdot]$, we have $C[M] \Downarrow n$ iff $C[N] \Downarrow n$. We
note that this is the same thing as considering convergence at all types, a situation quite
different from the case of call-by-name evaluation.

Abstract. We prove that confluence and strong normalisation are both modular properties for the addition
of algebraic term rewriting systems to Girard's $F^\omega$ equipped with either $\beta$-equality or $\beta\eta$-equality.
The key innovation is the use of $\eta$-expansions over the more traditional $\eta$-contractions.

We then discuss the difficulties encountered in generalising these results to type theories with dependent
types. Here confluence remains modular, but results concerning strong normalisation await further basic
research into the use of $\eta$-expansions in dependent type theory.


1  Introduction 

A property $P$ is modular for the combination of rewrite systems $\mathcal{T}_1$ and $\mathcal{T}_2$ iff whenever both $\mathcal{T}_1$
and $\mathcal{T}_2$ satisfy $P$, then so does the combined rewrite system $\mathcal{T}_1 \cup \mathcal{T}_2$. This paper studies the
modularity of confluence and strong normalization for combinations of higher order lambda calculi and
algebraic term rewriting systems. That is, does the addition of a confluent algebraic TRS to a higher
order lambda calculus (with or without rewrite rules for $\eta$-conversion) produce a system which is
still confluent? Similarly, is the combination of a strongly normalising algebraic TRS and a higher
order lambda calculus (again, with or without rewrite rules for $\eta$-conversion) still strongly normalising (SN)? And do
these results generalise to dependent type theories such as the Calculus of Constructions? These
questions are important from both a theoretical point of view, where one looks for general results
on combination of rewriting systems, and from a practical point of view, when one develops higher
order semi-unification algorithms, or establishes the formal properties of algebraic-functional
languages.

Tannen [9] showed that strong normalization and confluence are both modular properties for
the combination of algebraic TRSs with the simply typed lambda calculus equipped with
$\beta$-reduction. Gallier and Tannen [10, 11] extended these results to System F. Although strong
normalisation remains modular in these type theories if we work with both $\beta$- and $\eta$-reductions,
confluence is no longer a modular property. For example, if $s$ is a base type with constants $f : s \to s$
and $* : s$ and with a rewrite rule $f x \to *$, then this TRS is confluent. However, its combination
with the contractive $\eta$-rewrite rule fails to be confluent: $\lambda x.* \Leftarrow \lambda x.f x \Rightarrow f$. Because
of these problems with $\eta$-contractions, later research was restricted to adding more expressive
TRSs to systems equipped only with $\beta$-reduction. In particular, translations into intersection
type-assignment systems [3, 29, 26, 6, 5, 7, 4] were used to prove the modularity of strong normalisation
and completeness, i.e. the property of strong normalisation and confluence together, with
confluence following from strong normalisation by Newman's lemma. As far as the authors are aware,
modularity of confluence alone was not pursued any further and no attempts were made to study
modularity results for calculi equipped with $\beta\eta$-equality.
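The failure of confluence in this example can be checked mechanically. The following Python sketch is an untyped illustration only; the tuple encoding of terms and the helper names `free_vars`, `steps` and `normal_forms` are assumptions of the sketch. It enumerates the normal forms reachable from $\lambda x.f x$ under the rule $f x \to *$ together with $\eta$-contraction.

```python
# Terms (encoding assumed for this sketch):
#   ("var", x)  ("const", c)  ("lam", x, M)  ("app", M, N)

def free_vars(t):
    tag = t[0]
    if tag == "var":
        return {t[1]}
    if tag == "const":
        return set()
    if tag == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])      # application

def steps(t):
    """All one-step reducts of t under f x -> * and eta-contraction."""
    out = []
    tag = t[0]
    # Algebraic rule: f M -> * for any argument M.
    if tag == "app" and t[1] == ("const", "f"):
        out.append(("const", "*"))
    # Eta-contraction: lambda x. M x -> M, provided x is not free in M.
    if (tag == "lam" and t[2][0] == "app" and t[2][2] == ("var", t[1])
            and t[1] not in free_vars(t[2][1])):
        out.append(t[2][1])
    # Rewriting inside subterms.
    if tag == "lam":
        out += [("lam", t[1], body) for body in steps(t[2])]
    if tag == "app":
        out += [("app", fun, t[2]) for fun in steps(t[1])]
        out += [("app", t[1], arg) for arg in steps(t[2])]
    return out

def normal_forms(t):
    """Set of normal forms reachable from t (assumes termination)."""
    reducts = steps(t)
    if not reducts:
        return {t}
    found = set()
    for r in reducts:
        found |= normal_forms(r)
    return found

peak = ("lam", "x", ("app", ("const", "f"), ("var", "x")))
forms = normal_forms(peak)
```

Here `forms` contains both `("const", "f")` and `("lam", "x", ("const", "*"))`, two distinct normal forms of one term, so the combined system cannot be confluent.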

This paper extends the works of Tannen and Gallier in several ways. Firstly, we shall consider
more expressive calculi such as Girard's $F^\omega$ and Coquand and Huet's Calculus of Constructions,
henceforth denoted CoC. We show that confluence is modular for the combination of algebraic
TRSs with these calculi (without $\eta$-conversion). As mentioned earlier, these results are
surprisingly missing in the literature. Our second contribution is to extend these modularity results to
calculi equipped with $\beta\eta$-equality. This is done by replacing the problematic interpretation of
$\eta$-conversion as a contractive rewrite relation with its more recent interpretation as an expansionary
rewrite rule. Eta-expansions in the simply typed $\lambda$-calculus were first studied in the 70's, but only
recently have they been the object of careful study in a number of papers [1, 16, 13, 19, 27, 17] (for
an up-to-date survey, the interested reader can refer to [15]). This paper relies on Ghani's recent
results on $\eta$-expansions in $F^\omega$ [23] and CoC [22].

2 Extensional and Non-extensional $F^\omega$

We use the standard notions of substitutions, reduction, normal form, confluence, normalization,
etc., from the theory of $\lambda$-calculus and rewriting systems [8, 14]. The free variables of a term $M$
are denoted $FV(M)$ and we write $M\theta$ for the result of applying a substitution $\theta$ to the term $M$.
The domain of a substitution $\theta$ is denoted $\mathrm{dom}(\theta)$. If $\mathcal{R}$ is a rewrite relation with unique normal
forms, then reduction to $\mathcal{R}$-normal form is denoted $\mathcal{R}\!\downarrow$ and the unique $\mathcal{R}$-normal form of $t$ is
denoted $\mathcal{R}(t)$. Finally, a relation $R$ commutes with $S$ iff $(R^*)^{-1}; S^* \subseteq S^*; (R^*)^{-1}$ where $;$ is
the usual composition of relations. If two confluent relations commute, then their union is also
confluent.
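For finite relations the commutation condition can be tested directly. The Python sketch below is an illustration under assumptions: relations are given as sets of pairs over a finite set of nodes, and the names `reachable`, `commutes` and `confluent` are hypothetical. It checks that every peak, with one side an $R^*$-path and the other an $S^*$-path, can be closed by an $S^*$-path and an $R^*$-path respectively.

```python
def reachable(rel, start):
    """Nodes reachable from `start` by the reflexive-transitive closure of `rel`."""
    seen, todo = {start}, [start]
    while todo:
        current = todo.pop()
        for source, target in rel:
            if source == current and target not in seen:
                seen.add(target)
                todo.append(target)
    return seen

def commutes(r, s, nodes):
    """True iff whenever a ->>_r b and a ->>_s c, some d has b ->>_s d and c ->>_r d."""
    for a in nodes:
        for b in reachable(r, a):
            for c in reachable(s, a):
                if not (reachable(s, b) & reachable(r, c)):
                    return False
    return True

def confluent(r, nodes):
    """A relation is confluent iff it commutes with itself."""
    return commutes(r, r, nodes)

nodes = {"a", "b", "c", "d"}
r = {("a", "b"), ("c", "d")}
s = {("a", "c"), ("b", "d")}
union_is_confluent = confluent(r | s, nodes)

# Two confluent relations that do not commute: their union has an open peak at "a".
r_bad, s_bad = {("a", "b")}, {("a", "c")}
bad_union_is_confluent = confluent(r_bad | s_bad, nodes)
```

The second pair shows why commutation is needed: each relation is confluent on its own, but the union is not.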

In this section, two versions of $F^\omega$ will be defined. Extensional $F^\omega$ uses $\beta\eta$-equality for type
conversion while non-extensional $F^\omega$ has only $\beta$-equality for type conversion; our presentation
is based on Gallier's [21]. Formally, let $*$ be a distinguished symbol and let TVar and Var be
disjoint sets of type variables and term variables. These variables are used to define the kinds,
types (also called type constructors) and terms of $F^\omega$ as follows:

\[
\begin{array}{lrcl}
\text{(Kinds)} & K & := & * \mid K \Rightarrow K \\
\text{(Types)} & T & := & t \mid T \to T \mid \forall t : K.T \mid \lambda t : K.T \mid T\,T \\
\text{(Terms)} & M & := & x \mid \lambda x : T.M \mid M\,M \mid \Lambda t : K.M \mid M[T]
\end{array}
\]

where $t \in \mathrm{TVar}$ is a type variable and $x \in \mathrm{Var}$ is a term variable. A term is called an abstraction
iff it is of the form $\lambda x : T.M$ or $\Lambda t : K.M$. In order to ensure that types inhabit unique kinds, we
assign to each type variable $t$ a unique kind and denote the set of type variables having kind $K$ as
$\mathrm{TVar}(K)$. This kinding information is used to define the kinding judgements of $F^\omega$ as follows

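\[
\frac{t \in \mathrm{TVar}(K)}{t : K}
\qquad
\frac{s : K_2 \quad t \in \mathrm{TVar}(K_1)}{(\lambda t : K_1.s) : K_1 \Rightarrow K_2}
\qquad
\frac{t : K_1 \Rightarrow K_2 \quad s : K_1}{t\,s : K_2}
\]
\[
\frac{t \in \mathrm{TVar}(K) \quad s : *}{\forall t : K.s : *}
\qquad
\frac{t : * \quad s : *}{t \to s : *}
\]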

In order to give the typing judgements of extensional $F^\omega$ we define the usual $\beta\eta$-equality relation
on well-kinded types; if two types $t$ and $s$ are $\beta\eta$-equal, we denote this by writing $t =_{\beta\eta} s$. The
following lemma is proved in [23].

Lemma 1. $\beta\eta$-equality over types can be generated by a confluent, strongly normalizing reduction
relation containing $\beta$-reduction and restricted $\eta$-expansions. The unique normal form of a type $A$
is its long $\beta\eta$-normal form and is denoted $NF(A)$.

The typing judgements of extensional $F^\omega$ are defined by the following rules, while the typing
judgements of non-extensional $F^\omega$ use only $\beta$-equality for type conversion.

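\[
\frac{x : T \in \mathrm{dom}(\Gamma)}{\Gamma \vdash x : T}
\qquad
\frac{\Gamma \vdash M : t \quad t =_{\beta\eta} s \quad s : K}{\Gamma \vdash M : s}
\]
\[
\frac{\Gamma, x : t_1 \vdash M : t_2}{\Gamma \vdash (\lambda x : t_1.M) : t_1 \to t_2}
\qquad
\frac{\Gamma \vdash M : t_1 \to t_2 \quad \Gamma \vdash N : t_1}{\Gamma \vdash MN : t_2}
\]
\[
\frac{\Gamma, t_1 : K \vdash M : t_2}{\Gamma \vdash \Lambda t_1 : K.M : \forall t_1 : K.t_2}
\qquad
\frac{\Gamma \vdash M : \forall t_1 : K.t_2 \quad \Gamma \vdash s : K}{\Gamma \vdash M[s] : t_2[s/t_1]}
\]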

In  the  rest  of  this  paper,  we  confine  our  attention  to  only  those  types  that  kind  check  and  those 
terms  that  type  check.  In  addition,  we  increase  legibility  by  dropping  all  reference  to  the  context 
in  which  a  typing  judgement  occurs  whenever  there  is  no  danger  of  confusion  arising. 


2.1 Eta-expansions in $F^\omega$

As argued in the introduction, any robust result concerning the modularity of confluence in the
presence of $\eta$-conversion requires its interpretation as an expansion. In the simply typed $\lambda$-calculus,
one permits an expansion $t \to \lambda x : A.t x$ providing that $t$ is neither a $\lambda$-abstraction nor applied to
another term. This restricted expansion relation is SN, confluent and its reflexive, symmetric and
transitive closure is $\beta\eta$-equality. Thus $\beta\eta$-equality can be decided by reduction to normal form in
this restricted fragment.
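The two side conditions of the restricted expansion can be made concrete for the simply typed case. The following Python sketch is an illustration only, under several assumptions: the input is a well-typed, $\beta$-normal term, base types are strings and arrow types are tuples `("->", A, B)`, user-chosen variable names do not have the form `v0`, `v1`, ..., and the names `type_of` and `eta_long` are hypothetical. It expands every subterm of arrow type that is neither a $\lambda$-abstraction nor in function position.

```python
# Terms (encoding assumed for this sketch):
#   ("var", x)  ("lam", x, A, M)  ("app", M, N)

def type_of(term, env):
    """Type of a well-typed term in the environment `env` (a dict from names to types)."""
    tag = term[0]
    if tag == "var":
        return env[term[1]]
    if tag == "lam":
        return ("->", term[2], type_of(term[3], {**env, term[1]: term[2]}))
    return type_of(term[1], env)[2]      # application: codomain of the function type

def eta_long(term, env, applied=False):
    """Exhaustively apply restricted eta-expansion to a beta-normal term."""
    if term[0] == "lam":                 # abstractions are never expanded themselves
        _, x, a, body = term
        return ("lam", x, a, eta_long(body, {**env, x: a}))
    ty = type_of(term, env)
    if term[0] == "app":
        # The function part is in applied position; the argument is not.
        term = ("app", eta_long(term[1], env, applied=True), eta_long(term[2], env))
    if applied or isinstance(ty, str):   # applied terms and base-type terms stay as they are
        return term
    x = f"v{len(env)}"                   # fresh by the naming assumption stated above
    expanded = ("app", term, ("var", x))
    return ("lam", x, ty[1], eta_long(expanded, {**env, x: ty[1]}))

env = {"f": ("->", "s", "s"), "g": ("->", ("->", "s", "s"), "s")}
f, g = ("var", "f"), ("var", "g")
long_f = eta_long(f, env)                # lambda v2 : s. f v2
long_gf = eta_long(("app", g, f), env)   # g (lambda v2 : s. f v2)
```

In `long_gf` the head `g` is left alone because it is applied, while its argument `f` is expanded; running `eta_long` on a result returns it unchanged, reflecting that the restricted relation has normal forms.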

However, defining $\eta$-expansion in $F^\omega$ requires further care so as to avoid pitfalls caused by
the presence of multiple typings for terms. For instance, if an expansion $M \to \lambda x : A.Mx$
is permitted providing $M : A \to B$, then $\eta$-expansion alone is not even confluent as there are
rewrites

\[ \lambda x : A'.Mx \leftarrow M \to \lambda x : A.Mx \]

where we only know that $A = A'$ in the type-conversion relation. Worse, $\eta$-expansion defined
this way does not have unique normal forms and hence the usual strategy for computing long
normal forms (first contract $\beta$-redexes and then perform all remaining expansions) would no longer
be valid. For these reasons we define a type normalised form of $\eta$-expansion as follows


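\[
M \to \lambda x : A.Mx, \quad \text{if} \quad
\begin{cases}
x \text{ fresh} \\
M : A \to C, \text{ with } A \to C \text{ in type normal form} \\
M \text{ is not a } \lambda\text{-abstraction} \\
M \text{ is not applied}
\end{cases}
\qquad (1)
\]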


Note that the existence of type normal forms is assured by lemma 1. There is no need for a
type-normalised form of the higher order $\eta$-rewrite rule because if a term inhabits the types $\forall t : K.A$
and $\forall t : K'.A'$, then we must have $K = K'$. Hence our higher order $\eta$-expansion is:

\[
M \to (\Lambda t : K.M[t]), \quad \text{if} \quad
\begin{cases}
t \text{ fresh} \\
M : (\forall t : K.A) \\
M \text{ is not a polymorphic } \Lambda\text{-abstraction} \\
M \text{ is not applied}
\end{cases}
\qquad (2)
\]


Definition 2. Let $\beta$ be the rewrite relation consisting of all $\beta$-reductions on types and terms. Also,
let $\bar\eta$ be the rewrite relation consisting of all restricted expansions on types and those expansions
given in rules 1 and 2. The relation $\eta$ is defined by omitting the restriction to type normal forms
in rule 1. Finally define $\beta\bar\eta = \beta \cup \bar\eta$ and $\beta\eta = \beta \cup \eta$.

Results such as the modularity of confluence and strong normalisation are proven first for $\beta\bar\eta$ and
then lifted to the more general $\beta\eta$ via the following lemma.

Lemma 3. The reflexive, symmetric and transitive closure of $\to_{\beta\eta}$ and $\to_{\beta\bar\eta}$ are both the
usual $\beta\eta$-equality over terms of $F^\omega$.


Proof. Firstly, all $\eta$ equalities $M = \lambda x : A.Mx$ that seem to be forbidden by the restrictions of
$\to_{\beta\eta}$ can be obtained by $\beta$-reduction of $\lambda x : A.Mx$. Thus the reflexive, symmetric, transitive
closure of $\to_{\beta\eta}$ is $\beta\eta$-equality. For the second part of the lemma, notice that $\to_{\beta\bar\eta}$-expansions
are examples of $\to_{\beta\eta}$-expansions. In addition, if $M \to_{\beta\eta} \lambda x : A.Mx$, but $A$ is not a type
normal form, then both of these terms $\to_{\beta\bar\eta}$-reduce to $\lambda x : NF(A).Mx$.

The major theorems concerning $\beta\bar\eta$ and $\beta\eta$ are

Theorem 4. The rewrite relations $\beta\bar\eta$ and $\beta\eta$ are confluent and strongly normalizing to the long
$\beta\eta$-normal forms. The long $\beta\eta$-normal form of a term may be calculated by first contracting all
$\beta$-redexes and then performing any remaining type-normalised $\eta$-expansions.


3 Modularity Results for $F^\omega$

In this section we define algebraic TRSs and show the modularity of confluence and strong
normalisation for the unions of algebraic TRSs with $F^\omega$. First some definitions.

Definition 5. A signature $\Sigma$ consists of disjoint sets $T$ of base types and $F$ of function symbols
together with a function which assigns to every function symbol $f \in F$ a typing of the form
$\sigma_1 \to \cdots \to \sigma_n \to \sigma$, where $\sigma_1, \ldots, \sigma_n, \sigma \in T$ and $n \geq 0$. We say the arity of $f$ is $n$.

Definition 6. An algebraic rewrite rule is an ordered pair $(T, U)$ of algebraic terms such that $T$ is
not a variable, and every variable of $U$ also appears in $T$. An algebraic term rewriting system $\mathcal{T}$ is
a finite set $\{(T_1, U_1), \ldots, (T_n, U_n)\}$ of algebraic rewrite rules.

Definition 7. Given an algebraic TRS $\mathcal{T}$, the associated algebraic rewrite relation is the least
binary relation $\to_{\mathcal{T}}$ on terms such that if $(T, U) \in \mathcal{T}$, $\theta$ is a substitution and $C$ is a context,
then $C[T\theta] \to_{\mathcal{T}} C[U\theta]$.
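The algebraic rewrite relation just defined can be sketched for purely first-order terms as follows. This is an illustration under assumptions: variables are Python strings, an application of a function symbol is a tuple whose first component is the symbol name (so a constant `a` is `("a",)`), and the names `match`, `apply_subst` and `rewrites` are hypothetical. Matching finds the substitution $\theta$ with $T\theta$ equal to the subterm, and the recursion over subterms plays the role of the context $C$.

```python
def match(pattern, term, theta):
    """Extend `theta` so that `pattern` instantiated by it equals `term`, or return None."""
    if isinstance(pattern, str):                      # a rule variable
        if pattern in theta:                          # repeated variable: instances must agree
            return theta if theta[pattern] == term else None
        return {**theta, pattern: term}
    if isinstance(term, str) or pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for sub_pattern, sub_term in zip(pattern[1:], term[1:]):
        theta = match(sub_pattern, sub_term, theta)
        if theta is None:
            return None
    return theta

def apply_subst(term, theta):
    if isinstance(term, str):
        return theta.get(term, term)
    return (term[0],) + tuple(apply_subst(t, theta) for t in term[1:])

def rewrites(term, rules):
    """All terms obtainable from `term` by one algebraic rewrite step."""
    out = []
    for lhs, rhs in rules:                            # redex at the root
        theta = match(lhs, term, {})
        if theta is not None:
            out.append(apply_subst(rhs, theta))
    if not isinstance(term, str):                     # redex inside an argument
        for i in range(1, len(term)):
            for reduct in rewrites(term[i], rules):
                out.append(term[:i] + (reduct,) + term[i + 1:])
    return out

rules = [(("f", "x", "x"), "x")]                      # the non-left-linear rule f x x -> x
a, b = ("a",), ("b",)
same_args = rewrites(("f", a, a), rules)
different_args = rewrites(("f", a, b), rules)
in_context = rewrites(("g", ("f", a, a)), rules)
```

Because the rule repeats the variable `x`, the step fires on `f a a` but not on `f a b`; `in_context` shows the same step performed below the symbol `g`.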

Given an algebraic TRS, its union with calculi such as $F^\omega$ is defined as expected. A term of the
union of an algebraic TRS and $F^\omega$ is algebraic if it is either a variable of base type or has the form
$f\,t_1 \ldots t_n$ where $f \in F$ has arity $n$, and every $t_i$ is an algebraic term. Note that an algebraic term
is always of base type. The key concept in modular term rewriting is the layer structure, i.e. the
ability to decompose a term constructed from symbols in the union of two disjoint signatures into
a term constructed from symbols in only one signature and strictly smaller subterms whose head
symbol comes from the other signature. We follow [10] in using the following definitions relating
to layer structure.

Definition 8. A typing judgement $\Gamma \vdash M : s$ is called trunk iff $M$ is of the form $f\,M_1 \ldots M_k$,
where $f$ is a constant of arity $k$, otherwise it is called non-trunk.

Definition 9. An algebraic trunk decomposition of a typing judgement $\Gamma \vdash M : s$ consists of a
typing judgement $\Delta \vdash A : s$, where $A$ is an algebraic term, and a term-valued substitution $\phi$
such that $M = A\phi$, $\mathrm{dom}(\phi) = FV(A)$ and

- Each free variable in $A$ occurs only once

- For each $x \in FV(A)$, the typing judgement $\Gamma \vdash \phi(x) : s$ is non-trunk.

Note that all judgements $\Gamma \vdash M : s$ are either trunk or non-trunk because $M$ is of base
sort. Induction shows that all typing judgements $\Gamma \vdash M : s$ have algebraic trunk decompositions
which are unique up to the renaming of the free variables of $A$. We therefore write $M = A[\phi]$ for
an algebraic trunk decomposition of $M$ and refer to $A$ as a trunk of the term $M$.

Example 1. If $f$ is a binary function symbol and $a$ is a non-trunk term, then a trunk decomposition
for the term $f a a$ is $f x y[a/x, a/y]$. If $g$ is a unary function symbol and $a$ is a constant, then a trunk
decomposition of $g((\lambda x : s.x)(a))$ is $g y[(\lambda x : s.x)(a)/y]$.
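A trunk decomposition can be computed by a single traversal. The Python sketch below is an illustration only and makes several assumptions: fully applied function symbols are encoded as `("alg", f, arg1, ...)`, every other term is treated as an opaque non-trunk term, typing is ignored, and the name `decompose` and the generated variable names `y0`, `y1`, ... are hypothetical.

```python
import itertools

def decompose(term, fresh=None, phi=None):
    """Return (trunk, phi) with term equal to trunk instantiated by phi.

    Every maximal non-trunk subterm is replaced by a new variable, so each
    variable of the trunk occurs exactly once.
    """
    if fresh is None:
        fresh, phi = itertools.count(), {}
    if term[0] == "alg":                              # f M1 ... Mk: keep f in the trunk
        args = tuple(decompose(arg, fresh, phi)[0] for arg in term[2:])
        return ("alg", term[1]) + args, phi
    name = f"y{next(fresh)}"                          # non-trunk subterm: abstract it
    phi[name] = term
    return ("var", name), phi

# First part of Example 1: f a a with a an (opaque) non-trunk term.
a_opaque = ("other", "a")
trunk1, phi1 = decompose(("alg", "f", a_opaque, a_opaque))

# Second part of Example 1: g((lambda x : s. x)(a)) with a a constant.
a_const = ("alg", "a")
redex = ("app", ("lam", "x", ("var", "x")), a_const)
trunk2, phi2 = decompose(("alg", "g", redex))

# After contracting the beta-redex the term is g a, whose trunk is the whole term.
trunk3, phi3 = decompose(("alg", "g", a_const))
```

The two occurrences of `a` in `f a a` receive distinct variables, which is the linearity condition of Definition 9; comparing `trunk2` with `trunk3` shows how contracting a redex inside a non-trunk subterm can enlarge the trunk.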

Definition 10. A reduction $M = A[\phi] \to N$ is a trunk reduction iff the redex contracted is not
a subterm of one of the $\phi(x)$'s, otherwise it is a non-trunk reduction.

Example 2. Using the terms of example 1, and given a rewrite rule $f x x \to x$, there is a trunk
reduction $f a a \to a$. There is a non-trunk reduction $g((\lambda x : s.x)(a)) \to g a$.

Example 2 shows two undesirable properties of reduction. Firstly, the presence of non-left-linear
rewrite rules means that trunk reductions do not induce reductions of the trunk of the redex. For
instance $f a a \to a$ but there is no reduction $f x y \to x$. Also $\beta$-reduction may collapse the
layer structure of a term and hence a non-trunk reduction need not preserve the trunk of the redex,
e.g. the trunk of $g((\lambda x : s.x)(a))$ is $g y$ but the trunk of $g a$ is $g a$. We solve the first problem by
introducing a special term variable for each sort and then defining a special substitution $j$ which
maps every term variable of type $s$ to the special variable of sort $s$. There is also a solution for the second problem.

Lemma 11. Let $A\phi$ be a trunk decomposition for $M$.

- If $M \to_{\mathcal{T}} N$ is not a trunk reduction, then there is an algebraic trunk decomposition $N = A\phi'$
such that for some $x \in FV(A)$, $\phi(x) \to_{\mathcal{T}} \phi'(x)$, while $\phi(y) = \phi'(y)$ for all other $y \in FV(A)$.

- If $M \to_{\mathcal{T}} N$ is a trunk reduction, then there is an algebraic trunk decomposition $N = A'\phi'$
such that $Aj \to_{\mathcal{T}} A'j$ and for every $y \in FV(A')$, there exists an $x \in FV(A)$ such that
$\phi'(y) = \phi(x)$.

- If $M \to N$, then there is an algebraic trunk decomposition $N = A'\phi'$ and for every $y \in FV(A')$,
there exists an $x \in FV(A)$ such that either $\phi(x) \to N_x$ and $\phi'(y)$ is a subterm
of $N_x$, or $\phi(x) = \phi'(y)$.

Proof  The  lemma  is  proved  by  induction  on  the  term  M. 


3.1  Modularity  of  Confluence 

The proof strategy of [11] is used to show the modularity of confluence for the combination of
algebraic TRSs with both extensional and non-extensional $F^\omega$. In particular, reduction to long
$\beta\eta$-normal form in $F^\omega$ commutes with algebraic reductions.

Lemma 12. If $\mathcal{T}$ is a confluent algebraic rewriting system (over algebraic terms), then it is
confluent over the terms of the union of $F^\omega$ and $\mathcal{T}$ (mixed terms).

Proof. The proof of [11] generalises to $F^\omega$ and CoC because the only property required of mixed
terms is that the trunk of a term is preserved by non-trunk, algebraic reductions, as proven in
lemma 11.

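Lemma 13. Reduction to $\beta$ normal form commutes w.r.t. algebraic reduction, i.e.

\[ t \to_{\mathcal{T}} t' \quad \Longrightarrow \quad \beta(t) \to_{\mathcal{T}}^{*} \beta(t') \]
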

Proof  See  lemma  31  in  the  appendix  for  the  proof. 

These lemmas allow us to derive our first modularity result, namely that of confluence for the
addition of algebraic TRSs to non-extensional $F^\omega$. This is a new result as it shows modularity of
confluence alone, and not of confluence and strong normalization together as in [7]:

Corollary 14. The union of non-extensional $F^\omega$ with a confluent algebraic TRS is confluent.

Proof. By lemma 13, if $t =_{\beta \cup \mathcal{T}} t'$ then $\beta(t) =_{\mathcal{T}} \beta(t')$. By lemma 12, $\mathcal{T}$ is confluent over mixed
terms. Hence $\beta(t)$ and $\beta(t')$ have a common $\mathcal{T}$-reduct and hence $t$ and $t'$ have a common reduct.

Proving that confluence is modular for the addition of algebraic TRSs to extensional $F^\omega$
requires us to relate algebraic rewriting to expansive normal forms, extending [17]:

Lemma 15. Reduction to $\bar\eta$ normal form commutes w.r.t. algebraic reduction, i.e.

\[ t \to_{\mathcal{T}} t' \quad \Longrightarrow \quad \bar\eta(t) \to_{\mathcal{T}}^{*} \bar\eta(t') \]

Proof. The proof is by induction on the structure of terms. The fact that the $\bar\eta$ normal form of a
term is unique is necessary for the lemma to hold with arbitrary TRSs and not only left-linear ones.

As  a  consequence  of  the  previous  lemmas,  we  have  the  following 

Corollary 16. Reduction to $\beta\bar\eta$ normal form commutes with algebraic reduction, i.e.

\[ t \to_{\mathcal{T}} t' \quad \Longrightarrow \quad \beta\bar\eta(t) \to_{\mathcal{T}}^{*} \beta\bar\eta(t') \]

Proof. By theorem 4, the long $\beta\eta$-normal form of a term can be computed by first contracting
all $\beta$-redexes and then performing any remaining (restricted) $\eta$-expansions. Thus the corollary
follows from lemma 13 and lemma 15.


Theorem 17. The union of $\beta\bar\eta$ with a confluent algebraic TRS is confluent.

Proof. As in corollary 14 using corollary 16.

Corollary 18. The union of $\beta\eta$ (where $\eta$ is not restricted to type normal forms) with a confluent
algebraic TRS $\mathcal{T}$ is confluent.

Proof. If two terms are $\mathcal{T} \cup \beta\eta$ equivalent, they are $\mathcal{T} \cup \beta\bar\eta$ equivalent and hence by theorem 17
there is a $\mathcal{T} \cup \beta\bar\eta$ completion for these terms. But this is also a $\mathcal{T} \cup \beta\eta$ completion.


3.2  Modularity  of  Strong  Normalization 

The relations $\beta\bar\eta$ and $\beta\eta$ were proved confluent and SN in [23] by a modified reducibility
argument, adapted from traditional reducibility proofs to cope with the presence of expansionary
$\eta$-rewrite rules. Reducibility arguments are designed to cope with the higher order features at the
level of kinds and type constructors, while the effect of adding algebraic TRSs is only felt at the
level of base types. Thus these reducibility arguments generalise to prove the modularity of strong
normalisation for the combination of algebraic TRSs with extensional $F^\omega$.

Lemma 19. If $\mathcal{T}$ is a SN algebraic TRS, then its extension to $F^\omega$ is also SN.

Proof. The lemma is proved by induction on the structure of terms with the only interesting case
being a trunk term $M = A\phi$. By lemma 11, any infinite reduction sequence of $M$ induces either an
infinite reduction sequence of a $\phi(x)$, or an infinite reduction sequence of $Aj$. The first possibility
is impossible by the induction hypothesis, while the second possibility is also impossible as $\mathcal{T}$ is
SN on algebraic terms and $Aj$ is algebraic.




We now prove the main result of this section, namely that the union of a SN algebraic TRS
and $\beta\bar\eta$-reduction in $F^\omega$ is SN. The proof follows the modified reducibility argument of [23] and
thus we only sketch the general reducibility argument and concentrate instead on the particular
novelties which arise via the addition of algebraic TRSs. One defines a notion of reducibility
candidate and reducibility parameter exactly as in [23] and proves that if $T$ is a type and $\theta$ is a
reducibility parameter, then $T\theta$ is a reducibility candidate. The only new case is when $T$ is a sort
$s$ and here the reducibility candidate $s\theta$ is defined to be the SN terms of type $s$. The following pair
of lemmas are the key to completing the proof.

Lemma 20. If the terms $t_1, \ldots, t_n$ are SN, then so is $f\,t_1 \ldots t_n$.

Proof. That there are no infinite $\beta\bar\eta$ reduction sequences is proved in [23]. By corollary 16, a
rewrite $f\,t_1 \ldots t_n = M \to_{\mathcal{T}} N$ induces a sequence of rewrites $M_0 \to_{\mathcal{T}}^{*} N_0$ where $M_0$ and $N_0$
are the long $\beta\eta$-normal forms of $M$ and $N$. Close inspection of the proof shows that if the initial
rewrite is of the trunk, then this induced rewrite sequence is of length at least one. Hence there
can be no infinite reduction sequences containing an infinite number of trunk rewrites. By lemma
11, all other infinite reduction sequences of $f\,t_1 \ldots t_n$ induce infinite reduction sequences of one
of the terms $t_i$, which is prohibited by assumption.

Lemma 21. If $t_i$ is a SN term of sort $s_i$ for $i = 1, \ldots, m$, and $f$ has type $s_1 \to \cdots \to s_n$ where
$m < n$, then $f\,t_1 \ldots t_m$ is reducible.

Proof. The proof is by induction on the type of the term $f\,t_1 \ldots t_m$. If this type is a sort, then
we must show that $f\,t_1 \ldots t_m$ is SN under the assumption that each of the $t_i$ are SN. But this is
precisely lemma 20. If however the type of $f\,t_1 \ldots t_m$ is of the form $s \to T$, then we must show
that if $t$ is a reducible term of type $s$, then $f\,t_1 \ldots t_m t$ is reducible. Since the reducible terms of
type $s$ are exactly the SN ones, this follows from the induction hypothesis.

Lemma 22. If $\mathcal{T}$ is a SN algebraic TRS, then so are $\beta\bar\eta \cup \mathcal{T}$ and $\beta \cup \mathcal{T}$.

Proof. Having defined reducibility candidates as in [23], the proof concludes by showing that if $t$
is an arbitrary term, $\theta$ is a reducibility parameter, the free term variables of $t$ are among $x_j : T_j$ and
$u_j$ are members of the reducibility candidate $T_j\theta$, then $t[|\theta|][u_j/x_j]$ is a member of the reducibility
candidate $T\theta$ (note $|\theta|$ is the type-valued substitution underlying the reducibility parameter $\theta$).

The only new case is when $t$ is of the form $f\,t_1 \ldots t_n$ and one must show $(f\,t_1 \ldots t_n)[|\theta|][u_j/x_j]$
is reducible when each of the terms $t_i[|\theta|][u_j/x_j]$ is reducible. But this follows from lemma 21.

Strong normalisation of $\beta\bar\eta \cup \mathcal{T}$ follows by taking the identity substitution and identity reducibility
parameter, while strong normalisation of $\beta \cup \mathcal{T}$ follows as this is a subrelation of $\beta\bar\eta \cup \mathcal{T}$.

There is a simple trick to extend strong normalisation of $\beta\bar\eta \cup \mathcal{T}$ to $\beta\eta \cup \mathcal{T}$. If $t$ is a term,
let $\mathrm{TNF}(t)$ be the type normal form of $t$, i.e. the term that is obtained by normalising all the types
occurring as subterms and in $\lambda$-abstractions in $t$. A reduction $t \to t'$ is called type induced iff
the redex contracted occurs inside a subterm of $t$ which is actually a type.

Lemma 23. If there is a rewrite $t \to_{\beta\eta \cup \mathcal{T}} t'$ then there is a rewrite $\mathrm{TNF}(t) \to_{\beta\bar\eta \cup \mathcal{T}}^{*} \mathrm{TNF}(t')$. If the
original rewrite is not type induced then the final rewrite sequence is not of zero length.

Proof. The lemma is proved exactly as in [23].

Corollary 24. If $\mathcal{T}$ is a SN algebraic TRS, then $\beta\eta \cup \mathcal{T}$ is also SN.

Proof. There are no infinite sequences of type induced reductions because reduction on types is
SN. In addition, if $t \to t'$ is type induced, then $\mathrm{TNF}(t) = \mathrm{TNF}(t')$. Thus any infinite $\beta\eta \cup \mathcal{T}$
reduction sequence is mapped by type normalisation to an infinite $\beta\bar\eta \cup \mathcal{T}$ reduction sequence.

4 Modularity for Algebraic TRS and CoC

We have proven a series of modularity results concerning the addition of algebraic TRSs to $F^\omega$. The next logical step is to apply the same ideas to the much more powerful Calculus of Constructions [12]. Due to lack of space, we cannot introduce it here in detail, but we recall that the most important feature is that the distinction between types and terms is blurred and types can contain terms embedded within them; let $\beta$ and $\eta$ refer to the Calculus of Constructions rules in this section. Type dependency introduces infinite reduction sequences which are not present in non-dependent type theories. For example, if we define expansions by

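\[ \frac{\Gamma \vdash t : \Pi x : A.B}{\Gamma \vdash t \Rightarrow \lambda x : A.\,t x} \]

and define the term $B(x) = (\lambda z : X \to X.\,X)(x)$, then there is a typing judgement $X : *,\ x : X \to X \vdash x : \Pi z : B(x).X$ and hence an infinite reduction sequence

\[ X : *,\ x : X \to X \vdash x \Rightarrow \lambda z : B(x).\,x z \Rightarrow \lambda z : B(\lambda z : B(x).\,x z).\,x z \Rightarrow \cdots \]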

Notice that this example does not use any higher order types and so can be formulated in simpler dependent type theories such as LF. The existence of infinite reduction sequences such as the one above forces us to restrict our attention to a type normalised form of restricted $\eta$-expansion which we again denote by ff. Further, let /Sf] be the rewrite relation containing all $\beta$-reductions and type normalised restricted expansions and (St] be defined as in /Jr? but without the type normal form requirement.

In $F^\omega$ the existence of type normal forms is easy to prove as reduction at the level of types is defined independently of reduction at the level of terms. However, in a dependent type theory such as CoC the existence of long $\beta\eta$-normal forms is much harder to prove. One can either use the standard theory of $\eta$-contractions as in [20] or prove their existence while simultaneously developing the theory of expansions as in [22]. The following theorem is proved in [22]; we conjecture that j3fj is actually SN but a proof awaits further research.

Theorem 25. and are confluent and weakly normalising to the long $\beta\eta$-normal forms.

4.1  Modularity  of  Confluence 

As we have described above, the theory of strong normalization for $\eta$-expansions in CoC is not settled. Nevertheless, we can use confluence and weak normalization of (dfj to good avail and get the modularity of confluence for the union of algebraic TRSs with CoC.

Lemma 26. Algebraic reduction commutes with $\beta$-normalization in CoC.

Proof. As in [11]. Again, see lemma 31.

Corollary 27. If $T$ is a confluent algebraic TRS, then $\beta \cup T$ is also confluent.

Proof. As in corollary 14 and using lemma 26.

Proving  that  confluence  is  modular  for  the  union  of  algebraic  TRSs  with  extensional  CoC 
requires  another  commutation  lemma. 

Lemma 28. Algebraic reduction commutes with rj-normalisation.

Proof. Similar to lemma 15.

Corollary 29. If $T$ is a confluent algebraic TRS, then (3fj\JT and flr]\JT are also confluent.

Proof. pfiUT is proven confluent by a similar argument to theorem 17 using the commutation lemmas 26 and 28. The confluence of pr)UT is proved as in corollary 18.




5 Conclusions

We have proved a variety of modularity results for the combination of algebraic TRSs with higher order typed $\lambda$-calculi. In generalising the previous results in the literature, our key innovation is the use of $\eta$-expansions instead of the more problematic $\eta$-contractions.

There are several directions in which we wish to pursue this research. Most importantly we want a modularity result for strong normalisation for the addition of algebraic TRSs to CoC. As we remarked in the paper, this awaits further basic research into the use of $\eta$-expansions in CoC. In particular we conjecture that (3fj is SN and we further conjecture that the combination of a SN algebraic TRS with ^ remains SN.
A Commutation of algebraic reduction with reduction to /3 or Coc normal form.

In this section we simply reformulate lemma 4.1 of [11] in the framework of non extensional $F^\omega$ and Coc. Note that there is really nothing new in the proof, as the clever argument used in that lemma is tight enough to involve only the first order fragment of the calculi, so that extension to other calculi is straightforward.

In the following, let $A \to B$ be an algebraic rewrite rule, with $s$ being the sort of the algebraic term $A$ (and $B$) and $\vec{x} : \vec{s} = x_1 : s_1, \ldots, x_n : s_n = FV(A) \cup FV(B)$, with the $s_i$'s being the sorts of the variables used in the algebraic rule. Let also $z$ be a chosen variable of type $s_1 \to \cdots \to s_n \to s$. We also suppose a given typing and kinding context that we omit for readability.

We say that a term has the $z$-algebraic property if all occurrences of the variable $z$ in it are fully applied, i.e. at the head of a subterm $z P_1 \ldots P_n$ that possesses the type $s$ with all the $P_i$'s possessing the type $s_i$. This property is clearly inherited by subterms.

The central property which is needed is the following (where by $\beta$-n.f. we mean reduction to n.f. only w.r.t. the first order rule $\beta$ while $F^\omega$ (resp. Coc)-n.f. is w.r.t. the full non extensional reduction system, which we will also call full normal form):

Proposition 30. If $Z$ is an $F^\omega$ (resp. Coc) normal form having the $z$-algebraic property, then $X = \beta\text{-n.f.}(Z[\lambda\vec{x} : \vec{s}.A/z])$ and $Y = \beta\text{-n.f.}(Z[\lambda\vec{x} : \vec{s}.B/z])$ are $F^\omega$ (resp. Coc) normal forms and moreover $X \to^{*} Y$.

Proof. This is by induction on the size of $Z$. Since $Z$ is a normal form, it must be of the shape $\lambda v_1 \ldots v_k.\,h\,T_1 \ldots T_m$ with $v_i$ being either a term variable $x_i : S_i$ with $S_i$ a normal form, or a type variable $t_i : K$.

We  have  now  two  cases: 
$h \neq z$: then $X = \beta\text{-n.f.}(Z[\lambda\vec{x} : \vec{s}.A/z]) = \lambda\vec{v}.\,h\,T_1^A \ldots T_m^A$ and $Y = \beta\text{-n.f.}(Z[\lambda\vec{x} : \vec{s}.B/z]) = \lambda\vec{v}.\,h\,T_1^B \ldots T_m^B$ with $T_i^A = \beta\text{-n.f.}(T_i[\lambda\vec{x} : \vec{s}.A/z])$ and $T_i^B = \beta\text{-n.f.}(T_i[\lambda\vec{x} : \vec{s}.B/z])$. But $T_i$ is still a full normal form, of size strictly smaller than $Z$ (as at least $h$ is removed), and it still possesses the $z$-algebraic property as it is a subterm of $Z$. So, by induction hypothesis, $T_i^A$ is a full normal form and $T_i^A \to^{*} T_i^B$, hence $X$ is a full normal form and $X \to^{*} Y$.

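$h = z$: In this case, $k = m$ and we have that

\[ Z[\lambda\vec{x} : \vec{s}.A/z] = \lambda\vec{v}.(\lambda\vec{x} : \vec{s}.A)\,T_1 \ldots T_m \to^{*} \lambda\vec{v}.A[T_1/x_1 \ldots T_n/x_n] \]

and

\[ Z[\lambda\vec{x} : \vec{s}.B/z] = \lambda\vec{v}.(\lambda\vec{x} : \vec{s}.B)\,T_1 \ldots T_m \to^{*} \lambda\vec{v}.B[T_1/x_1 \ldots T_n/x_n] \]

Then, since no $\beta$-reduction can take place at the junction points of the $T_i$ with $A$, as they have as type a base sort, $X = \beta\text{-n.f.}(Z[\lambda\vec{x} : \vec{s}.A/z]) = \lambda\vec{v}.A[T_1^A/x_1 \ldots T_n^A/x_n]$ and $Y = \beta\text{-n.f.}(Z[\lambda\vec{x} : \vec{s}.B/z]) = \lambda\vec{v}.B[T_1^B/x_1 \ldots T_n^B/x_n]$. As above, the $T_i^A$ (resp. $T_i^B$) are smaller normal forms than $X$ (resp. $Y$), so by induction hypothesis we have that the $T_i^A$ and $T_i^B$ are full normal forms and that $T_i^A \to^{*} T_i^B$. Then, both $X$ and $Y$ are full normal forms and moreover

\[ X = \lambda\vec{v}.A[T_1^A/x_1 \ldots T_n^A/x_n] \to^{*} \lambda\vec{v}.A[T_1^B/x_1 \ldots T_n^B/x_n] \to \lambda\vec{v}.B[T_1^B/x_1 \ldots T_n^B/x_n] = Y. \]

We are done.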

Using this crucial result it is then quite easy to show the equivalent of Lemma 4.1 of [11]:

Lemma 31. Let $A \to B$ be an algebraic rewrite rule. If $M \to N$, then $fnf(M) \to^{*} fnf(N)$, where $fnf(M)$ is the full non-extensional normal form w.r.t. $F^\omega$ or Coc.

Proof. If $M \to N$, then $M = C[A\phi]$ and $N = C[B\phi]$ with $\phi$ a substitution $[P_1/x_1, \ldots, P_n/x_n]$. Then, for a suitable variable $z$ of type $s_1 \to \cdots \to s_n \to s$, we can write terms

\[ M' = C[z P_1 \ldots P_n][\lambda\vec{x} : \vec{s}.A/z] \quad \text{and} \quad N' = C[z P_1 \ldots P_n][\lambda\vec{x} : \vec{s}.B/z] \]

s.t. $M' \to^{*} M$ and $N' \to^{*} N$. Now, $C[z P_1 \ldots P_n]$ has the $z$-algebraic property, and since this property is preserved by the non-extensional $F^\omega$ and Coc reductions, also $fnf(C[z P_1 \ldots P_n])$ has it.

Now, we can apply the previous proposition to such a full normal form and obtain that $M'' = \beta\text{-n.f.}(fnf(C[z P_1 \ldots P_n])[\lambda\vec{x} : \vec{s}.A/z])$ and $N'' = \beta\text{-n.f.}(fnf(C[z P_1 \ldots P_n])[\lambda\vec{x} : \vec{s}.B/z])$ are full normal forms and that $M'' \to^{*} N''$. Since $M' \to^{*} M''$ (resp. $N' \to^{*} N''$) and $M' \to^{*} M$ (resp. $N' \to^{*} N$), we have, due to confluence of $F^\omega$ and Coc, that $M'' = fnf(M)$ and $N'' = fnf(N)$, and we are done.


On  Explicit  Substitutions  and  Names 
(Extended  Abstract) 
Abstract. Calculi with explicit substitutions have found widespread acceptance as a basis for abstract machines for functional languages. In this paper we investigate the relations between variants with de Bruijn numbers, with variable names, with reduction based on raw expressions and calculi with equational judgements. We show the equivalence between these variants, which is crucial in establishing the correspondence between the semantics of the calculus and its implementations.


1  Introduction 

Explicit substitution calculi (or $\lambda\sigma$-calculi for short) first appeared in a seminal paper by Abadi et al. [1]. The basic idea is that instead of having substitutions as a meta-level operation, as in traditional $\lambda$-calculus, we should make them part of the object-level calculus. The advantages of this approach are twofold. Firstly, it makes it possible to design much more efficient abstract machines as we are allowed to delay substitutions, and secondly it makes it much easier to prove them correct since the calculus and its implementation are closer.

There are several variants of calculi with explicit substitutions. Some of these variants are geared towards semantics [15], [3], others are derived with implementations in mind [9], [8], [2]. Rather than listing all variants, we explain in this paper what we take to be the principal differences between them. This way we describe what appears at first sight as various "design choices" for lambda-calculi. But we then justify why we have to develop calculi for each possible choice if we want to prove semantics and syntax equivalent. Moreover, by using the context handling of type theory as a guide, we are able to define a confluent calculus with explicit substitutions and names, something that Abadi et al. were not able to do.


1.1  Equations  first  versus  Reductions  first 

There are two main approaches when defining typed $\lambda$-calculi with or without explicit substitutions. The first one, in the spirit of Martin-Löf's type theory [10], defines the calculus with equations-in-context. Reduction is then a derived notion, obtained by orienting the equations. The second approach considers the set of typed terms as a subset of the set of raw terms, and hence reduction is defined on raw terms, which are not necessarily well-formed. Equality is now the derived notion, namely it is the symmetric and transitive closure of the relation generated by the reduction rules.

* Research supported under the EPSRC project no. GR/L28296, x-SLAM: The Explicit Substitutions Linear Abstract Machine.

The first approach is required when giving semantics to $\lambda$-calculi because only well-formed objects have a meaning. The second approach avoids the need to check for well-formedness during reduction, which is incorporated in the first approach. As a consequence, this approach is well-suited for implementations, but a semantics for terms can only be given by showing the equivalence of this presentation to the Martin-Löf-style presentation. Whereas this equivalence is easy to prove in the case of the simply-typed $\lambda$-calculus (and hence it is not really necessary to differentiate between the two approaches in this case), the difference becomes crucial as soon as we add, for example, dependent types [14]. This difference becomes crucial again when we consider calculi with explicit substitutions.

This  paper  presents  calculi  for  both  approaches  and  shows  their  equivalence 
(see  section  3).  This  is  because  we  want  to  connect  the  implementation,  which 
is  based  on  the  second  approach,  with  the  semantics,  which  is  based  on  the  first 
approach. 

1.2  Typed  versus  untyped  calculi 

There  are  typed  and  untyped  calculi  with  explicit  substitutions,  both  of  which 
are  presented  already  in  [1].  The  typing  rules  enforce  two  different  restrictions: 
firstly,  they  eliminate  expressions  with  misuse  of  variables,  e.g.,  ones  where  we  try 
to  substitute  two  different  terms  for  the  same  variable  simultaneously.  Secondly, 
they ensure that the only well-typed $\lambda$-terms are the ones of the simply-typed $\lambda$-calculus.


1.3  Names  versus  de  Bruijn  numbers 

Another important kind of choice the designer of an explicit substitution $\lambda$-calculus can make concerns the difference between variable names and de Bruijn numbers. De Bruijn numbers were initially considered as an implementational trick for Automath: instead of using variables like $x, y, z$ de Bruijn proposed to use natural numbers (that correspond to the binding level of the variable), in such a way that a class of $\alpha$-congruent terms corresponds to a single syntactic object. Hence two expressions with variable names are $\alpha$-equivalent if and only if the corresponding terms with de Bruijn numbers are syntactically equal. More than simply an implementational trick, de Bruijn numbers are helpful when defining the semantics of the calculus in question. The point is that a de Bruijn number $n$ corresponds exactly to the $n$-th projection $A_n \times \cdots \times A_1 \to A_n$.


1 The equivalence proofs can still be done [6], but some of the required properties of the type theories, like confluence and subject reduction, are very hard to establish.
There is a trade-off between a version of the calculus with de Bruijn numbers and a version with names. Expressions with variable names are much easier to read. The difference becomes apparent even for relatively small terms: compare the expressions $\lambda x.(\lambda y z.x)(\lambda z.x)$ and $\lambda.(\lambda.\lambda.3)(\lambda.2)$. The main drawback of the version with names is the need to identify terms which only differ in the names of bound variables: the semantics of terms can only be defined modulo $\alpha$-equivalence. This complicates the definition of the syntax significantly, as the definition of $\alpha$-equivalence is rather involved (see section 3). On the other hand, $\alpha$-conversion is not needed for the version with de Bruijn numbers, and the absence of $\alpha$-equivalence makes this better suited for implementations.
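The correspondence between the two notations can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the tuple encoding of terms and the helper names `to_de_bruijn`, `show` and `alpha_equivalent` are hypothetical and not taken from the paper, type annotations on binders are omitted, and indices are 1-based as in the example above.

```python
def to_de_bruijn(term, bound=()):
    """Translate a closed named lambda term to 1-based de Bruijn form.

    Named terms are tuples: ("var", x), ("lam", x, body), ("app", t, s).
    """
    tag = term[0]
    if tag == "var":
        # `bound` lists binder names innermost first, so the position of the
        # first match (plus one) is the de Bruijn number of this occurrence.
        return ("idx", bound.index(term[1]) + 1)
    if tag == "lam":
        return ("lam", to_de_bruijn(term[2], (term[1],) + bound))
    if tag == "app":
        return ("app", to_de_bruijn(term[1], bound), to_de_bruijn(term[2], bound))
    raise ValueError(f"unknown term constructor: {tag!r}")


def show(term):
    """Render a de Bruijn term in the style used in the text."""
    tag = term[0]
    if tag == "idx":
        return str(term[1])
    if tag == "lam":
        return "λ." + show(term[1])
    return "(" + show(term[1]) + ")(" + show(term[2]) + ")"


def alpha_equivalent(a, b):
    """Two named terms are alpha-equivalent iff their de Bruijn forms coincide."""
    return to_de_bruijn(a) == to_de_bruijn(b)


x = ("var", "x")
# The named term λx.(λy z.x)(λz.x) from the text.
example = ("lam", "x", ("app", ("lam", "y", ("lam", "z", x)), ("lam", "z", x)))
print(show(to_de_bruijn(example)))  # prints λ.(λ.λ.3)(λ.2)
```

Comparing de Bruijn forms makes the check for $\alpha$-equivalence a plain structural equality test, which is the property the text relies on.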

So a judicious use of both versions seems the best option: for the presentation of results in the meta-theory, the version with names is used, and for implementations one uses de Bruijn-terms to handle variable access. Of course a good implementation keeps the variable names as extra information during reduction so that terms can be printed with names rather than with de Bruijn numbers.


1.4  Iterated  Substitutions 

The fourth choice concerns the need (or not) for composition of substitutions.

The precursor of the $\lambda\sigma$-calculus, Curien's $\lambda\rho$-calculus [5], was designed to capture environment machines and had no notion of iterated substitutions. This is rather restrictive, as nested substitutions arise in several situations: during reduction to normal form rather than weak head normal form, when modelling sharing in environment machines, when modelling instantiation in theorem provers, and as the counterpart of composition in the categorical semantics of $\lambda$-calculi. The $\lambda\sigma$-calculus was developed by Abadi et al. [1] with these applications in mind. Iterated substitutions seem to us an essential part of any $\lambda\sigma$-calculus.


Summing  up 

Summarising,  it  seems  to  us  that  the  first  “design  choices”  are  not  choices  at  all. 
We  must  have  both  the  equations-in-context  and  the  reductions-first  versions, 
both  the  typed  and  untyped  versions  and  both  the  de  Bruijn  and  the  names 
versions,  as  our  goal  is  the  implementation  of  abstract  machines.  It  also  seems 
essential  to  have  composition  of  substitution  for  the  reasons  outlined  above. 
Explicit  weakening  or  not  is,  as  far  as  this  paper  is  concerned,  a  matter  of  taste. 

The paper is structured as follows. We define our calculus of explicit substitutions and equations in context in the next section. Next we discuss issues relating to binding operations and $\alpha$-equivalence in explicit substitution calculi. We prove the necessary syntactical properties (confluence and normalisation) of our calculus and then we examine the equivalence between the versions of the $\lambda\sigma$-calculus with typed and untyped reduction rules. We conclude by briefly discussing implementations and applications, which are mostly future work.
2 A calculus with equational judgements

In this section we present (with minor modifications) Martin-Löf's $\lambda$-calculus with explicit substitutions. This calculus is the $\lambda\sigma$-calculus by Abadi et al. but with names and equations-in-context. Tasistro [15] describes this calculus and gives ample motivation for the form of the judgements and their interpretation.

2.1 Well-formed expressions

We start by presenting raw expressions and defining the judgements for well-formed expressions and then give a few intuitions about the calculus.

Definition 1 (Raw Expressions). The types of the $\lambda\sigma$-calculus with names are base types and function types $A \Rightarrow B$. The raw expressions of the calculus are given by the following grammar:

\[ t ::= x \mid \lambda x : A.t \mid t\,t \mid f * t \qquad f ::= \langle\rangle \mid \langle f, t/x \rangle \mid f ; f \]

We call expressions of the first kind terms and expressions of the second kind substitutions³. Moreover, we write $\langle t_n/x_n, \ldots, t_1/x_1 \rangle$ for $\langle \cdots \langle\langle\langle\rangle, t_n/x_n \rangle, t_{n-1}/x_{n-1} \rangle, \ldots, t_1/x_1 \rangle$.
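As a minimal sketch of how the grammar might be represented, the following Python fragment encodes raw expressions as tagged tuples and expands the list abbreviation for substitutions. The encoding and all constructor names (`EMPTY`, `ext`, `comp`, `var`, `lam`, `app`, `sub`, `subst_list`) are assumptions made for illustration; the paper does not prescribe a concrete representation.

```python
# Substitutions
EMPTY = ("empty",)                       # the empty substitution ⟨⟩

def ext(f, t, x):
    """The substitution ⟨f, t/x⟩."""
    return ("ext", f, t, x)

def comp(f, g):
    """The composition f ; g."""
    return ("comp", f, g)

# Terms
def var(x):
    return ("var", x)

def lam(x, ty, body):
    """The abstraction λx:ty.body."""
    return ("lam", x, ty, body)

def app(t, s):
    return ("app", t, s)

def sub(f, t):
    """The explicit substitution f * t."""
    return ("sub", f, t)

def subst_list(pairs):
    """Expand ⟨t_n/x_n, ..., t_1/x_1⟩, given as [(t_n, x_n), ..., (t_1, x_1)].

    The first pair ends up innermost, next to ⟨⟩, as in the abbreviation above.
    """
    f = EMPTY
    for t, x in pairs:
        f = ext(f, t, x)
    return f
```

For instance, `subst_list([(var("s"), "x"), (var("u"), "y")])` builds the nested expression for $\langle\langle\langle\rangle, s/x\rangle, u/y\rangle$.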

We identify terms which are identical up to change of bound variables. Because not only the $\lambda$-abstraction but also the explicit substitution $f * t$ binds variables, the definition of bound variable is significantly more complex than in the $\lambda$-calculus; for a precise definition of the notion of bound variable and of $\alpha$-equivalence see Section 3.

Judgements for well-formed expressions require an additional kind of raw expressions, namely contexts. Such a context is a list $x_1 : A_1, \ldots, x_n : A_n$ of assignments of a type to a variable. (Contexts are called environments in [1].) We call a context well-formed if no variable occurs twice in it. From now on we tacitly assume contexts to be well-formed. We denote the empty context, which is the special case of $n = 0$, by $[\,]$. Note that contexts are lists rather than multisets; in other words the order is relevant. This approach generalises to dependent type theory and is compatible with categorical semantics. Because contexts like $x : A, y : B$ and $y : B, x : A$ are not identified, there is an explicit representation of the exchange rule. This avoids problems with the existence of normal forms of substitutions; for details see Section 4.

We have two judgements for the well-formedness of raw expressions, namely $\Gamma \vdash t : A$, the usual "$t$ is a term of type $A$ in context $\Gamma$", and $\Gamma \vdash f : \Delta$. The last judgement should be interpreted as "$f$ is an (explicit) substitution for variables in $\Delta$ where the free variables of the terms to be substituted are contained in $\Gamma$". Such a substitution roughly corresponds to a list of substitutions in the $\lambda$-calculus. We call any context $\Gamma'$ arising from $\Gamma$ by deleting some assignments $x_i : A_i$ a subcontext; in that case we write $\Gamma' \subseteq \Gamma$ and call $\Gamma$ an extension of $\Gamma'$.

2 We use the term $\lambda\sigma$-calculus as a generic term for any variant of the calculi presented in [1].
3 Note in particular the existence of an explicit substitution operator, denoted by $*$, which takes a substitution $f$ and a term $t$ and returns a term $f * t$.
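Definition 2 (Typing Judgements). The inference rules for the judgements $\Gamma \vdash t : A$ and $\Gamma \vdash f : \Delta$ are as follows:

(i) On terms:

\[ \frac{}{\Gamma, x : A, \Gamma' \vdash x : A} \qquad \frac{\Gamma, x : A \vdash t : B}{\Gamma \vdash \lambda x : A.t : A \Rightarrow B} \qquad \frac{\Gamma \vdash t : A \Rightarrow B \quad \Gamma \vdash s : A}{\Gamma \vdash t s : B} \qquad \frac{\Gamma \vdash f : \Delta \quad \Delta \vdash t : A}{\Gamma \vdash f * t : A} \]

(ii) On substitutions:

\[ \frac{}{\Gamma \vdash \langle\rangle : \Gamma'}\ (\Gamma' \subseteq \Gamma) \qquad \frac{\Gamma \vdash f : \Delta \quad \Gamma \vdash t : A}{\Gamma \vdash \langle f, t/x \rangle : \Delta, x : A} \qquad \frac{\Gamma \vdash f : \Gamma' \quad \Gamma' \vdash g : \Gamma''}{\Gamma \vdash f ; g : \Gamma''} \]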

The new syntax is best explained by relating the terms with explicit substitutions to terms with the usual implicit substitution of the simply-typed $\lambda$-calculus. The basic idea is that a substitution $\Gamma \vdash f : (\vec{y} : \vec{B})$ in the $\lambda\sigma$-calculus corresponds to a list of terms $\vec{t} = (t_1, \ldots, t_n)$ such that $\Gamma \vdash t_i : B_i$ in the $\lambda$-calculus. Moreover, the operation $*$ models explicit substitution: a term $f * t$ in the $\lambda\sigma$-calculus corresponds to a term $t[t_i/x_i]$ (with the simultaneous substitution of all terms $t_i$ for $x_i$ in $t$) in the $\lambda$-calculus.

The operations "$;$" and "$\langle \_, \_ \rangle$" model sequential and parallel composition of substitutions respectively. If $\Gamma \vdash f : (\vec{x} : \vec{A})$ and $\vec{x} : \vec{A} \vdash g : \Delta$ and $f$ and $g$ correspond to the lists $\vec{t}$ and $\vec{s}$ respectively, then the substitution $f ; g$ corresponds to the list $(s_1[\vec{t}/\vec{x}], \ldots, s_m[\vec{t}/\vec{x}])$ and hence models sequential composition of the substitutions $f$ and $g$. The substitution $\langle\rangle$ acts not only as the identity substitution, in the sense that the term $\langle\rangle * t$ corresponds to $t$, but also as weakening: if $\Gamma \vdash t : A$ and $\Gamma'$ is an extension of $\Gamma$ then the term $\Gamma' \vdash \langle\rangle * t : A$ corresponds to the $\lambda$-term $\Gamma' \vdash t : A$ in the extended context $\Gamma'$.

2.2  Equations  and  Reductions 

Now we turn to the equations-in-context, which are judgements $\Gamma \vdash f = g : \Delta$ and $\Gamma \vdash t = s : A$. This notion of equality is sometimes called judgemental equality. If a judgement $\Gamma \vdash f = g : \Delta$ can be stated for any contexts $\Gamma$ and $\Delta$ such that $\Gamma \vdash f : \Delta$ implies $\Gamma \vdash g : \Delta$, we will write $f = g$ for $\Gamma \vdash f = g : \Delta$. Similarly, if a judgement $\Gamma \vdash t = s : A$ can be stated for any context $\Gamma$ and type $A$ such that $\Gamma \vdash t : A$ implies $\Gamma \vdash s : A$, we will write $t = s$ for this judgement. In section 5 we will relate this version of the calculus to a version with equations derived from reduction defined on raw terms.

Definition 3. The equations of the $\lambda\sigma$-calculus with names are as follows:

(i) Equations modelling (traditional) $\lambda$-calculus reductions:

\[ (\lambda x : A.t)s = \langle\langle\rangle, s/x\rangle * t \qquad \lambda x : A.t x = t \ \text{ if $x$ not free in $t$} \]

(ii) Equations for substitutions (in the third rule, $y = x$ if $y$ is neither a free variable nor a substitution variable in $f$, or $y$ is a variable which is neither a free variable of $t$ and $f$ nor a substitution variable in $f$)⁵:

\begin{align*}
\langle f, t/x\rangle * x &= t && (1) \\
\langle f, t/y\rangle * x &= f * x \quad \text{if } x \neq y && (2) \\
f * \lambda x : A.t &= \lambda y : A.\langle f, y/x\rangle * t && (3) \\
f * (t s) &= (f * t)(f * s) && (4) \\
\langle\rangle ; f &= f && (5) \\
& && (6) \\
f ; \langle g, t/x\rangle &= \langle f ; g, f * t/x\rangle && (7) \\
& && (8) \\
f * (g * t) &= (f ; g) * t && (9)
\end{align*}

\[ \frac{\Gamma \vdash f : \Delta \qquad \Delta = x_1 : A_1, \ldots, x_n : A_n}{\Gamma \vdash f = \langle f * x_1/x_1, \ldots, f * x_n/x_n\rangle : \Delta} \]

4 We abbreviate a context $x_1 : A_1, \ldots, x_n : A_n$ to $\vec{x} : \vec{A}$. Similarly we write $t[\vec{s}/\vec{x}]$ for

The first two equations are the equations corresponding to $\beta$- and $\eta$-reduction in the $\lambda$-calculus respectively. The equation for the $\beta$-rule has a term with an explicit substitution on the right hand side rather than an implicit substitution as in the $\lambda$-calculus. This is the place where explicit substitutions are introduced during the reduction of $\lambda$-terms to normal form in order to make the delay of substitution possible. The equations (1)-(4) push substitutions over the constructors of $\lambda$-terms. The equation $\langle f, t/x\rangle * x = t$ is the one where the replacement of $x$ by the term $t$ actually takes place. The equations $f ; (g ; h) = (f ; g) ; h$ and $f * (g * t) = (f ; g) * t$ express associativity of substitution. The last equation for substitution expresses the fact that a substitution is determined by its effect on variables. In particular, this equation causes the substitutions $(\vec{x} : \vec{A}) \vdash \langle\rangle : (\vec{x} : \vec{A})$ and $(\vec{x} : \vec{A}) \vdash \langle x_i/x_i\rangle : (\vec{x} : \vec{A})$ to be equal. This equation can be thought of as an $\eta$-rule for the explicit substitutions. It is necessary for the definition of an extensional semantics, e.g., a categorical semantics.
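To see how orienting the equations from left to right yields computation steps, the following Python sketch applies a single rule at the root of a raw expression. It is an illustration only: the tagged-tuple encoding and the function name `step` are assumptions, only the $\beta$-equation and equations (1), (2), (4), (5), (7) and (9) are covered, and equation (3) is left out because it needs a choice of fresh variable.

```python
def step(e):
    """Apply one equation, oriented left to right, at the root of e.

    Terms: ("var", x), ("lam", x, A, t), ("app", t, s), ("sub", f, t).
    Substitutions: ("empty",), ("ext", f, t, x), ("comp", f, g).
    Returns the rewritten expression, or None if no covered rule applies.
    """
    tag = e[0]
    if tag == "app" and e[1][0] == "lam":
        # beta: (λx:A.t) s  ->  ⟨⟨⟩, s/x⟩ * t
        _, x, _ty, t = e[1]
        return ("sub", ("ext", ("empty",), e[2], x), t)
    if tag == "sub":
        f, t = e[1], e[2]
        if t[0] == "var" and f[0] == "ext":
            _, g, u, y = f
            # (1) ⟨g, u/x⟩ * x -> u        (2) ⟨g, u/y⟩ * x -> g * x  if x != y
            return u if y == t[1] else ("sub", g, t)
        if t[0] == "app":
            # (4) f * (t s) -> (f * t)(f * s)
            return ("app", ("sub", f, t[1]), ("sub", f, t[2]))
        if t[0] == "sub":
            # (9) f * (g * t) -> (f ; g) * t
            return ("sub", ("comp", f, t[1]), t[2])
    if tag == "comp":
        f, g = e[1], e[2]
        if f[0] == "empty":
            # (5) ⟨⟩ ; f -> f
            return g
        if g[0] == "ext":
            # (7) f ; ⟨g, t/x⟩ -> ⟨f ; g, f * t / x⟩
            return ("ext", ("comp", f, g[1]), ("sub", f, g[2]), g[3])
    return None


s = ("var", "s")
redex = ("app", ("lam", "x", "A", ("var", "x")), s)
delayed = step(redex)      # the beta step only records the substitution
result = step(delayed)     # equation (1) performs the actual replacement
```

The two calls show the delay of substitution described above: the $\beta$ step produces $\langle\langle\rangle, s/x\rangle * x$, and only the later use of equation (1) replaces $x$ by $s$.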

Definition 4 (Reduction Relations). The (typed) reduction relations $\Gamma \vdash t \to t' : A$ (over terms) and $\Gamma \vdash f \to f' : \Delta$ (over substitutions) are defined by orienting the above equations from left to right.

Again, if a reduction rule can be stated for any contexts $\Gamma$, $\Delta$ and types $A$ such that $\Gamma \vdash f : \Delta$ implies $\Gamma \vdash f' : \Delta$ and $\Gamma \vdash t : A$ implies $\Gamma \vdash t' : A$, we will write $f \to f'$ and $t \to t'$ respectively.

Before  we  investigate  the  meta-theoretical  properties  of  this  calculus,  we 
examine  a-equivalence  in  detail  in  the  next  section. 

3 $\alpha$-equivalence

In this section we examine $\alpha$-equivalence in a $\lambda\sigma$-calculus with names, which is more complex than in the $\lambda$-calculus.

We aim to retain the results for the $\lambda$-calculus, in particular we want two expressions to be $\alpha$-equivalent iff their corresponding de Bruijn-terms are equal, and reduction should preserve $\alpha$-equivalence. The latter causes problems which are not apparent in the $\lambda$-calculus. If we define $\alpha$-equivalence to be the smallest congruence such that $\lambda x : A.t = \lambda y : A.t[y/x]$, then $\beta$-reduction does not preserve $\alpha$-equivalence: the two terms $(\lambda x : A.t)s$ and $(\lambda y : A.t[y/x])s$ are $\alpha$-equivalent, but the contracta $\langle s/x\rangle * t$ and $\langle s/y\rangle * t[y/x]$ are not.

5 The substitution variables in a substitution $f$ are all variables $x$ occurring in an expression $\langle g, t/x\rangle$; for a precise definition see Section 3.

Hence we have to define $\alpha$-equivalence in such a way that terms like $\langle s/x\rangle * t$ and $\langle s/y\rangle * t[y/x]$ are $\alpha$-equivalent. This means that the substitution operator $*$ acts as another binding operator. However, this is a different kind of binding from the one $\lambda$-abstraction provides: the substitution operator binds in any expression $f * t$ those variables in $t$ where there is a term contained in $f$ which is to be substituted in $t$. In the example $\langle s/x\rangle * t$ the variable $x$ is bound by $*$. Note that the substitution operator $*$ indicates neither the scope nor the name of the variables that it binds.

We define the sets of free variables and substitution variables (which are all those variables in a substitution $f$ which are bound in a term $f * t$ or in a substitution $f ; g$) by a mutual induction. The interesting cases for the free variables are $FV(\lambda x : A.t) = FV(t) \setminus \{x\}$ and $FV(f * t) = FV(f) \cup (FV(t) \setminus SV(f))$.

The substitution variables are defined by $SV(\langle\rangle) = \emptyset$, $SV(\langle f, t/x\rangle) = SV(f) \cup \{x\}$ and $SV(f ; g) = (SV(f) \setminus FV(g)) \cup SV(g)$. A variable occurring in $t$ is called bound in the term $t$ if it is not a free variable in $t$. A variable occurring in $f$ is called bound in $f$ if it is neither a free variable nor a substitution variable in $f$.
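The mutual induction can be sketched in Python as follows. The clauses for $\lambda$-abstraction, $f * t$ and $SV$ follow the definitions above; the remaining clauses of `fv` on substitutions (for $\langle\rangle$, $\langle f, t/x\rangle$ and $f ; g$) are not spelled out in the text and are assumptions made here for illustration, as are the tuple encoding and the names `fv` and `sv`.

```python
def fv(e):
    """Free variables of a raw term or substitution (tagged-tuple encoding)."""
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag == "lam":                      # ("lam", x, A, t): FV(t) \ {x}
        return fv(e[3]) - {e[1]}
    if tag == "app":
        return fv(e[1]) | fv(e[2])
    if tag == "sub":                      # ("sub", f, t): FV(f) ∪ (FV(t) \ SV(f))
        return fv(e[1]) | (fv(e[2]) - sv(e[1]))
    # The clauses below are assumed; the text only lists the interesting cases.
    if tag == "empty":
        return set()
    if tag == "ext":                      # ("ext", f, t, x)
        return fv(e[1]) | fv(e[2])
    if tag == "comp":                     # ("comp", f, g)
        return fv(e[1]) | (fv(e[2]) - sv(e[1]))
    raise ValueError(f"unknown constructor: {tag!r}")


def sv(f):
    """Substitution variables of a raw substitution, as defined in the text."""
    tag = f[0]
    if tag == "empty":
        return set()
    if tag == "ext":                      # SV(⟨f, t/x⟩) = SV(f) ∪ {x}
        return sv(f[1]) | {f[3]}
    if tag == "comp":                     # SV(f ; g) = (SV(f) \ FV(g)) ∪ SV(g)
        return (sv(f[1]) - fv(f[2])) | sv(f[2])
    raise ValueError(f"not a substitution: {tag!r}")


# ⟨s/x⟩ * (x y): x is bound by *, so only s and y remain free.
f = ("ext", ("empty",), ("var", "s"), "x")
t = ("app", ("var", "x"), ("var", "y"))
free = fv(("sub", f, t))
```

In the example, `free` contains `s` and `y` but not `x`, matching the observation that $*$ binds the substitution variables of $f$ in $t$.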

In the $\lambda$-calculus Curry defines substitution before he defines $\alpha$-equivalence. As the substitution has been made explicit, we only need to define renaming (i.e., the replacement of one variable by another) as an operation in the meta-theory to state $\alpha$-equivalence. This definition of renaming requires an auxiliary notion to change the name of the substitution variable $x$ in $\langle f, t/x\rangle$ to $y$, i.e., we define an operation $f\{y/x\}$, which satisfies $\langle f, t/x\rangle\{y/x\} = \langle f, t/y\rangle$. This name-changing substitution is given by $\langle\rangle\{y/x\} = \langle\rangle$; $(f ; g)\{y/x\} = f ; (g\{y/x\})$ if $x \in SV(g)$ and $(f ; g)\{y/x\} = f\{y/x\} ; g$ if $x \notin SV(g)$; $\langle f, t/x\rangle\{y/x\} = \langle f, t/y\rangle$ and $\langle f, t/z\rangle\{y/x\} = \langle f\{y/x\}, t/z\rangle$ if $x \neq z$.

Definition 5. We define the renaming of the variable $x$ by the variable $y$ in $t$ or $f$ by induction over the structure of raw expressions.

\begin{align*}
x[y/x] &= y & z[y/x] &= z \ \text{ if } z \neq x \\
(\lambda x : A.t)[y/x] &= \lambda x : A.t & (\lambda z : A.t)[y/x] &= \lambda w : A.t[w/z][y/x] \ (z \neq x) \\
(t u)[y/x] &= (t[y/x])(u[y/x]) & (f * t)[y/x] &= f\{z_i/y_i\}[y/x] * t[z_i/y_i][y/x] \\
\langle\rangle[y/x] &= \langle\rangle & \langle f, t/z\rangle[y/x] &= \langle f[y/x], t[y/x]/z\rangle \\
(f ; g)[y/x] &= (f\{z_i/y_i\})[y/x] ; (g[z_i/y_i])[y/x]
\end{align*}

In the second rule for $\lambda$-abstraction, $w$ is equal to $y$ if $x \notin FV(t)$ or $y \notin FV(s)$, otherwise $w$ occurs neither free nor bound in $t$ or $s$. In the rule for $f * t$, the variable $z_i$ is equal to $y_i$ if $y_i \notin FV(s)$, otherwise it is a fresh variable. The same condition applies for the case $f ; g$.

The  definition  of  a-equivalence  can  now  be  stated. 

Definition 6. We define $\alpha$-equivalence in the $\lambda\sigma$-calculus to be the smallest congruence relation on raw expressions including

\[ \lambda x : A.t =_\alpha \lambda y : A.t[y/x] \qquad f * t =_\alpha f\{y/x\} * t[y/x] \qquad f ; g =_\alpha f\{y/x\} ; g[y/x] \]

The variable $y$ is either $x$ or it is not free in $t$, $f$ and $g$ nor is it contained in $SV(f)$. In the last two rules $x$ is bound by $*$ and $;$ respectively.

Next we examine the interaction between α-equivalence, which is defined on
raw expressions, and the judgements. For the typing judgements, $\Gamma \vdash t : A$
means there exists an α-equivalent term $t'$ such that $\Gamma \vdash t' : A$
according to the rules presented in Section 2. A similar convention is adopted
for all other judgements and for reduction on raw expressions. The next theorem
justifies this convention.

Theorem 7. Assume that $t_1$ and $t_2$ are two α-equivalent λσ-terms, and assume
that $f_1$ and $f_2$ are two α-equivalent substitutions. If $\Gamma \vdash t_1 : A$,
then also $\Gamma \vdash t_2 : A$, and similarly if $\Gamma \vdash f_1 : \Delta$,
then also $\Gamma \vdash f_2 : \Delta$. If $\Gamma \vdash t_1 = s : A$, then also
$\Gamma \vdash t_2 = s : A$, and if $\Gamma \vdash f_1 = g : \Delta$, then
$\Gamma \vdash f_2 = g : \Delta$. If $t$ and $s$ are α-equivalent terms in the
λ-calculus, then they are α-equivalent in the λσ-calculus, too.

The λσ-calculus with de Bruijn numbers has no variable names and hence also
no α-equivalence. The intuition is that α-equivalence is in fact only a consequence
of the existence of names and does not affect the λσ-calculus in any other way.
More precisely, equality modulo α-equivalence in the calculus with names and
equality in the λσ-calculus with de Bruijn numbers coincide. The translation
from the λσ-calculus with names into the λσ-calculus with de Bruijn numbers
is defined by an induction over the derivation and replaces each variable $x$ in a
context $\Gamma, x : A, \Gamma'$ by the length $|\Gamma'|$ of the context $\Gamma'$.
For details, see [12].
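The idea of replacing a variable by the length of the context to its right can be sketched in Python for plain untyped λ-terms. This is only an illustration under assumptions: the tuple encoding and the function name are hypothetical, types are dropped, and the recursion runs over the term rather than over a typing derivation.

```python
# Hypothetical encoding: ("var", x) | ("lam", x, body) | ("app", t, u)

def to_de_bruijn(t, ctx=()):
    """Translate a named term into de Bruijn form.

    `ctx` lists the variable names in scope, innermost binder last, so a
    variable x found in a context  G, x, G'  is replaced by len(G').
    """
    if t[0] == "var":
        # scan from the right so that the innermost binding of a name wins
        for distance, name in enumerate(reversed(ctx)):
            if name == t[1]:
                return ("var", distance)
        raise ValueError(f"unbound variable {t[1]!r}")
    if t[0] == "lam":
        return ("lam", to_de_bruijn(t[2], ctx + (t[1],)))
    return ("app", to_de_bruijn(t[1], ctx), to_de_bruijn(t[2], ctx))
```

Two α-equivalent named terms, such as λx.λy.x and λa.λb.a, are mapped to the same nameless term, which is the sense in which α-equivalence disappears.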

The results of this section imply that Barendregt’s variable convention can
be adopted in the rest of this paper when we prove meta-theoretic properties.
To be precise, we consider α-equivalent terms to be syntactically equal, and in
the sequel we assume that all bound variables occur nowhere else in a given
mathematical context (e.g., neither as free variables as in $x(\lambda x{:}A.x)$ nor as
substitution variables as in $\langle f, t/x \rangle * \lambda x{:}A.s$).

4  Confluence  and  Normalisation 

This section investigates confluence and normalisation for the (equational)
λσ-calculus. We deduce confluence from the confluence of the simply-typed
λ-calculus, using a modularity argument, first described in [7] and familiar under
the name “interpretation method”. The argument is well known; here we just
make an effort to present it in its generic form.

Definition 8 (Modularity Properties). Assume that there is a translation
$[\![-]\!]$ of the extended calculus into the confluent one satisfying the
following modularity properties: Firstly, if $t \leadsto s$ in the extended
system, then also $[\![t]\!] \leadsto^* [\![s]\!]$. Secondly, for each term $t$
in the extended system we have $t \leadsto^* [\![t]\!]$. Thirdly, for each
reduction $t \leadsto s$ in the confluent system there exists a reduction
sequence $t \leadsto^* s$ in the extended system.
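The way the three modularity properties yield confluence of the extended system can be spelled out as a short derivation. The notation is an assumption of this sketch: $\leadsto^*$ is the reflexive-transitive closure of reduction in either system, $[\![-]\!]$ is the translation, and the first property is read as "if $t \leadsto s$ then $[\![t]\!] \leadsto^* [\![s]\!]$". The `align*` environment requires the `amsmath` package.

```latex
\begin{align*}
&\text{Assume } t \leadsto^* t_1 \text{ and } t \leadsto^* t_2
  \text{ in the extended system.}\\
&\text{First property (applied stepwise): }
  [\![t]\!] \leadsto^* [\![t_1]\!] \text{ and } [\![t]\!] \leadsto^* [\![t_2]\!].\\
&\text{Confluence of the target system: there is } u \text{ with }
  [\![t_1]\!] \leadsto^* u \text{ and } [\![t_2]\!] \leadsto^* u.\\
&\text{Third property: both reduction sequences lift to the extended system.}\\
&\text{Second property: } t_i \leadsto^* [\![t_i]\!] \leadsto^* u
  \quad (i = 1, 2), \text{ so } u \text{ closes the diagram.}
\end{align*}
```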

In our case, this general argument works as follows. The translation $[\![-]\!]$
works by “carrying out the substitutions”, i.e., $[\![\langle t_i/x_i \rangle * t]\!]$ =

All reduction rules except the β-rule $(\lambda x{:}A.t)s \leadsto \langle s/x \rangle * t$
and the η-rule $\lambda x{:}A.tx \leadsto t$ model explicit substitution. We call these
rules σ-rules, and we denote a σ-reduction by $t \leadsto_\sigma s$. We expect the
translation from the λσ-calculus to the λ-calculus to map the redex and the
contractum of a σ-reduction to the same λ-term. We obtain the modularity
properties as a consequence; for details see the technical report [12].

Note that the modularity properties do not hold for the original version of
the λσ-calculus with names. In particular, the reduction
$\langle t/x, s/y \rangle \leadsto \langle s/y, t/x \rangle$
violates the first modularity property.

Formalising the argument given above, we obtain confluence of the λσ-calculus.

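Theorem 9 (Typed confluence). Let $\Gamma \vdash t : A$ be any well-formed λσ-term.
If $t \leadsto^* t_1$ and $t \leadsto^* t_2$, then there exists a well-formed λσ-term
$\Gamma \vdash u : A$ such that $t_1 \leadsto^* u$ and $t_2 \leadsto^* u$. Similarly,
let $\Gamma \vdash f : \Delta$ be any well-formed substitution. If $f \leadsto^* f_1$
and $f \leadsto^* f_2$, then there exists a well-formed substitution
$\Gamma \vdash g : \Delta$ such that $f_1 \leadsto^* g$ and $f_2 \leadsto^* g$.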

Normalisation also arises as a consequence of the modularity properties of
the translation. Because the proof consists of giving an effective normalisation
strategy, we obtain decidability of equality in the λσ-calculus as a corollary.

Theorem 10 (Normalisation). Every well-formed term $t$ and substitution $f$ of
the λσ-calculus has a normal form, which can be effectively computed. The normal
form for a term is a normal λ-term, and the normal form for a substitution
is a list of normal λ-terms.

Melliès [11] shows that strong normalisation does not hold. As a counterexample,
he gives a λ-term which reduces to the identity but which admits a
reduction sequence where a term $t$ reduces to a term $t'$ which contains $t$ as a
subterm. But it is possible to show that all reduction strategies that reduce an
expression first to one in weak head-normal form (i.e., substitution is pushed
under λ-abstraction only if the λ-abstraction is the outermost constructor) lead
only to finite sequences of reductions [13].

5  Reduction  on  Raw  Terms 

The  main  part  of  this  section  examines  a  typed  calculus  with  reduction  defined 
on  raw  terms.  At  the  end  we  mention  briefly  untyped  calculi. 

Apart from the extensionality rule for substitution
$\Gamma \vdash f \leadsto \langle f * x_i / x_i \rangle : x : A$,
no reduction rule uses typing information. Hence we omit this rule, and
write $\leadsto_r$ for the notion of reduction on raw terms given by turning all
reduction rules $\Gamma \vdash f \leadsto g : \Delta$ except
$\Gamma \vdash f \leadsto \langle f * x_i / x_i \rangle : x : A$ into rules
$f \leadsto_r g$, and all reduction rules $\Gamma \vdash t \leadsto s : A$ into rules
$t \leadsto_r s$. For this restricted fragment, which suffices for the design of
abstract machines, we show in this section that reduction based on raw terms and
the reduction derived from equational judgements (see Section 2) coincide. The
important properties for this proof are uniqueness of types and subject
reduction, which says that well-typed expressions reduce to well-typed
expressions. The same proofs that work for the simply-typed λ-calculus with
reduction defined on raw expressions work also for the system with explicit
substitution.

Now  we  turn  to  the  confluence  proof  for  the  calculus  based  on  reduction 
on  raw  terms.  The  proof  follows  the  general  outline  established  in  the  previous 
section  but  it  does  not  work  directly  because  the  previous  proof  uses  the  fact 
that  every  substitution  reduces  to  a  list  of  terms.  This  is  no  longer  true  if  we  use 
reduction  on  raw  expressions:  substitutions  can  no  longer  be  reduced  to  lists  of 
terms  in  general,  but  only  to  so-called  canonical  forms,  i.e.,  lists  of  terms  with 
an additional weakening at the end. In particular, the substitution
$\langle t/x \rangle ; \langle\rangle$ is a normal form if $t$ is a normal form.

The details and the adaptation of the confluence proof are given in the
technical report [12]. We only cite the final theorem.

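Theorem 11. Let $\Gamma \vdash t : A$ be any well-formed λσ-term. If
$t \leadsto_r^* t_1$ and $t \leadsto_r^* t_2$, then there exists a well-formed
λσ-term $\Gamma \vdash u : A$ such that $t_1 \leadsto_r^* u$ and $t_2 \leadsto_r^* u$.
Similarly, let $\Gamma \vdash f : \Delta$ be any well-formed substitution. If
$f \leadsto_r^* f_1$ and $f \leadsto_r^* f_2$, then there exists a well-formed
substitution $\Gamma \vdash g : \Delta$ such that $f_1 \leadsto_r^* g$ and
$f_2 \leadsto_r^* g$.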

Remark. Curien et al. [4] showed that confluence on open terms fails for the
untyped λσ-calculus. To obtain confluence they introduce a special syntactic
construction, which describes the effect of pushing a substitution under a
λ-abstraction. (They consider a version with de Bruijn numbers, but the idea
should work as well with a calculus with variables.)

The result of good design now follows: the judgemental equality presentation
of our λσ-calculus with names is equivalent to its presentation based on reduction
on raw terms.

Theorem 12. The λσ-calculus with judgemental equality is equivalent to the
λσ-calculus based on reduction on raw terms. Thus $\Gamma \vdash t = s : A$ if and
only if $\Gamma \vdash t : A$, $\Gamma \vdash s : A$ and $t \leftrightarrow_r^* s$,
where $\leftrightarrow_r^*$ is the equivalence relation generated by $\leadsto_r$,
and similarly for substitutions.

This confluence proof can also be applied to the untyped λσ-calculus. The
reason is that the translation of explicit substitutions into lists of λ-terms
can still be done. In this way the confluence of the untyped λ-calculus can be
lifted. Obviously, normalisation fails, as any counterexample to normalisation
in the untyped λ-calculus can be reproduced in the calculus with explicit
substitutions.

6  Conclusions 

We examined choices for designing calculi with explicit substitutions. We
presented our own version of a calculus of explicit substitutions, for the simply
typed λ-calculus, for which we showed the equivalence between its version arising
from semantical (equations-in-context) considerations and syntactic (reduction on
raw terms) ones. (This equivalence is crucial in establishing the correspondence
between the semantics of the calculus and its implementations.) We discussed its
typed and untyped variants and the names and de Bruijn flavours of the calculus.
We also proved all the necessary, standard properties of our calculus. The
proofs are also standard.

This calculus contains what we take to be the essential points of our approach
of using categorical type theory to inform the implementation of abstract
machines. Ritter’s PhD thesis is perhaps a more impressive example of the same
approach, dealing with the Calculus of Constructions. But the point of the paper
is to show how “inevitable” this calculus is, given our original goals. This is to
be contrasted with the multitude of other explicit substitution calculi. Also it
was necessary to clarify the case of the simply typed lambda-calculus, to modify
it appropriately, to deal with the linear lambda-calculus. Linearity introduces
several new challenges that we are tackling at the moment.

Abstract. We provide a characterization of fan annihilation rules of
Lamping’s optimal algorithm through suitable paths on the initial graphs
of the evaluation. This allows us to recast the computational complexity
issues of the algorithm in terms of statics. The fruitfulness of the path
characterization is pointed out by proving the relationship between the
computational complexity of the Krivine machine and Lamping’s
algorithm.


1  Introduction 

At the end of the 1980s, Lamping discovered a complex graph reduction technique [6]
for λ-terms that was optimal in the sense that no redex is ever duplicated by the
algorithm (cf. [8]). This goal was achieved by an ingenious management of shared
contexts, using suitable sharing (fan-in) and unsharing (fan-out) nodes in the
graphs.

Recently Asperti [1] and, independently, Lawall and Mairson [7] have shown
that Lamping’s management of shared expressions may have an exponential
cost with respect to the number of β-reductions. They also conjectured that the
total number of fan-annihilations in the reduction of a term could provide a
reasonable lower bound to its “intrinsic complexity”. Unfortunately, very little
is known about the dynamic aspects of Lamping’s algorithm, such as the growth
of Lamping’s graphs (called sharing graphs in the following), the ratio between
application-abstraction nodes and the other nodes, and the exact cost of the
sharing management (which is our ultimate goal).

So far, the only dynamic results concern β-reductions. In particular, in
[3, 2] we provided a bijective correspondence between families of β-reductions
fired along the evaluation of a term $t$ and suitable paths in the initial graph of $t$.
This result has been used for proving the correctness and coincidence of several
optimal algorithms, proving the fruitfulness of our approach. In this paper, we
apply the same technique to cover other dynamic aspects of Lamping’s algorithm,
giving a precise and simple description of fan-annihilations as suitable paths in
the initial term.

It turns out that the computational complexity of Lamping’s abstract
algorithm for λI-terms is a function of fan annihilations and β-reductions. In other
words, the computational complexity issues may be recast in terms of statics,
hopefully a more easily comprehensible view. Indeed we exploit the static view
for proving that the complexity of the Krivine machine cannot be better than
the number of fan annihilations in Lamping’s algorithm. This is not striking, but
just aims at emphasizing the relevance of path characterizations for reasoning
about (even different) machines.

Technical developments and proofs are omitted from this extended abstract.
They may be found at ftp://ftp.cs.unibo.it/pub/laneve/fullicalp.ps.gz.

1.1  Lamping’s  abstract  algorithm 

We said that Lamping’s algorithm implements optimality through a suitable
sharing of subexpressions, performed by explicit nodes called fans. Fan nodes,
together with application and abstraction nodes, are the core set of nodes of
Lamping’s algorithm. The rules governing their interaction are illustrated in
Figure 1 below.

[Figure not reproduced; its rule panels include (Fan-Ann) and (Fan-Comm).]

Fig. 1. Interaction rules of Lamping’s abstract algorithm

There is no space here for introducing Lamping’s algorithm. The reader can
find a smooth introduction in [1]. We only remark that there are two rules for
evaluating fan-interactions: one annihilating the two fans and the other performing
duplication. In the abstract algorithm described above we have assumed the
presence of an oracle solving the problem of which rule to apply (at each time
exactly one rule may be used). Lamping’s implementation of the oracle is described
in Section 2.

1.2  The producer/consumer analogy

The set of rules in Figure 1 may be split in two groups: annihilations and
duplications. Annihilations and duplications are strongly related: in a sense the
former “consume” nodes, the latter “produce” (by duplicating) nodes. This
relation is evident in those terms of the λI-calculus whose normal form is an
atomic value (the final graph is a single edge). Let $d$, $f$ and $a$ be respectively
the number of duplications, β-reductions and annihilations along the reduction.
Let moreover $|M|$ be the number of applications, abstractions and fan nodes in
$M$. Since each duplication adds two new nodes to the graph, each β-redex or
fan-annihilation removes two nodes from it, and we have no nodes at the end of
the computation, the following equation holds:

$|M| + 2d - 2f - 2a = 0$

So, $f + a = d + |M|/2$. This immediately gives the following property:

Property 1. The length of the abstract Lamping-evaluation of λI-expressions
yielding constant values only depends on the families of β-redexes and of
fan-annihilations.
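The bookkeeping behind this property can be made concrete with a small Python sketch. Two things are assumptions of the sketch rather than statements of the text: the function names, and the convention that the length of an evaluation is the total number of rule applications $d + f + a$. The sample figures in the usage note are hypothetical and do not come from an actual term.

```python
def duplications(size_m, f, a):
    """Solve |M| + 2d - 2f - 2a = 0 for d, the number of duplications.

    size_m: number of application, abstraction and fan nodes in M
    f:      number of beta-reductions
    a:      number of fan-annihilations
    """
    twice_d = 2 * f + 2 * a - size_m
    if twice_d < 0 or twice_d % 2 != 0:
        raise ValueError("counts are inconsistent with the node equation")
    return twice_d // 2

def evaluation_length(size_m, f, a):
    """Total number of rule applications, assuming each rule counts once."""
    return duplications(size_m, f, a) + f + a
```

With the hypothetical values `size_m = 4`, `f = 3`, `a = 2` the equation forces three duplications, so the total is determined by `f` and `a` once `size_m` is fixed.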

1.3  Dynamics  vs.  Statics 

By Property 1, the computational complexity of λ-terms only depends on
β-redexes and fan annihilations. In [3], β-reductions have been successfully recast
in terms of suitable paths on syntax trees of λ-terms. In this paper we are going to
apply the same methodology to fan-annihilation rules, thus covering every
interesting dynamic aspect of Lamping’s abstract algorithm. For instance, the reader
may observe that, in the evaluation of (2 A), the rule (Fan-Ann) is used twice.
Consider one of them and, going backward along the reduction, follow the path
traversed by the two interacting fans. When you get back to the initial graph,
you will discover that each annihilation rule corresponds to a path in Figure 2.
Both paths have a very precise and similar structure: they start and terminate
at the same fan, and can be uniquely decomposed as
$\zeta\,\lambda\,\psi\,@\,\varphi\,@\,(\psi)^r\,\lambda\,(\zeta)^r$, where $\zeta$
is a discriminant (the path from the fan to the variable port of the λ), $\psi$ is a
virtual redex, followed by an @-cycle $\varphi$ (see Definition 9), and $(\ )^r$ is the
“reversing” operation. In the present paper we prove that this decomposition is
general:

Property 2. Fan-annihilations are in bijective correspondence with legal paths
in the initial graph consisting of a discriminant, a virtual redex, an @-cycle, the
virtual redex reverted and the discriminant reverted.


1.4  The  comparison  with  Krivine  machine 

Paths offer a fine grain description of the evaluation of λ-terms. For this
reason other reduction mechanisms may be reduced to path computations. As a
consequence the characterization of fan annihilations in terms of paths becomes
an important step towards the comparison of Lamping’s optimal algorithm with
other reduction techniques.

Fig. 2. Virtual fan-annihilations in (2 A)

For instance, Danos and Regnier have recently proved that each move of
the Krivine machine, a well known environment machine for functional languages,
actually corresponds to a path computation. A close inspection of this
correspondence, together with Property 2, allows us to draw the following consequence:

Property 3. Let $M$ be a λI-term reducing to a constant. The length of
computation of $M$ in Lamping’s abstract algorithm is at most $O(n)$, where $n$ is the
length of the Krivine machine computation.

We observe that this property also gives more evidence to the thesis that the
total number of fan-annihilations in the reduction of a term provides a reasonable
lower bound to its “intrinsic” computational complexity [1, 7]. We finally recall
that the Krivine machine may have an exponential slow-down with respect to
Lamping’s abstract algorithm (for instance the evaluation of $n\,2\,I\,c$, where $n$
and 2 are Church numerals, $I$ is the identity and $c$ is a constant, is $O(2^n)$ in
Krivine machines and $O(n)$ in Lamping’s algorithm). This is not very surprising,
since the Krivine machine implements a call-by-name strategy, which is very
inefficient for evaluating terms.
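The call-by-name behaviour of the Krivine machine on the example $n\,2\,I\,c$ can be observed with a minimal Python sketch. It is a textbook-style environment machine on de Bruijn terms, not the formulation used by the works cited here; the term encoding, the step-counting convention (one step per machine transition) and all names are assumptions of this illustration.

```python
# Hypothetical encoding with de Bruijn indices:
#   ("var", n) | ("lam", body) | ("app", t, u) | ("const", name)

def church(n):
    """Church numeral n:  lam f. lam x. f (f ... (f x))  with n copies of f."""
    body = ("var", 0)
    for _ in range(n):
        body = ("app", ("var", 1), body)
    return ("lam", ("lam", body))

IDENTITY = ("lam", ("var", 0))
CONST_C = ("const", "c")

def krivine(term):
    """Run the machine; return (final head term, number of transitions)."""
    env = ()      # environment as a linked list: (closure, rest) or ()
    stack = []    # pending arguments, each a closure (term, env)
    steps = 0
    while True:
        tag = term[0]
        if tag == "app":
            # push the argument unevaluated (call-by-name), go to the function
            stack.append((term[2], env))
            term = term[1]
        elif tag == "lam":
            if not stack:
                return term, steps
            # bind the top closure to the abstracted variable
            env = (stack.pop(), env)
            term = term[1]
        elif tag == "var":
            cell = env
            for _ in range(term[1]):
                cell = cell[1]
            # jump into the closure bound to the variable
            term, env = cell[0]
        else:
            # a constant in head position: stop
            return term, steps
        steps += 1

def run(n):
    """Evaluate  n 2 I c  and return (result, number of transitions)."""
    term = ("app", ("app", ("app", church(n), church(2)), IDENTITY), CONST_C)
    return krivine(term)
```

Calling `run(n)` for increasing `n` returns the constant each time, and comparing the returned step counts gives a feel for how quickly the call-by-name machine slows down on this family of terms.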

2  Pairing  fans:  Lamping’s  full  algorithm 

In order to solve the problem of correct fan pairing, Lamping added a local level
structure to the bidimensional graphs presented in the Introduction. Each node
is decorated with an integer tag which specifies the level at which it lives: two
fans match if they meet at the same level; they mismatch otherwise. Furthermore
there are two new control nodes which operate on the level structure: the
croissant, which opens or closes a level, and the bracket, which temporarily closes
a level or restores a temporarily closed one.

More precisely, sharing graphs are undirected graphs built from the indexed
nodes in Figure 3.

[Figure not reproduced; nodes shown: application, abstraction, fan, croissant, bracket.]

Fig. 3. Sharing nodes


The port of a node depicted with an arrow is called its principal port. This is the
only port where a node can possibly interact with other nodes in a graph
reduction rule. The other ports of each node are called auxiliary. It is convenient to
introduce particular names for the auxiliary ports of application and abstraction
nodes. In particular, the port of the application leading to the context (usually
depicted at its top) will be called context port, while the other auxiliary port
will be the argument port. In the case of an abstraction node, the port leading
to the body of the function (usually depicted at the right of the other auxiliary
port) will be called body port, while the other auxiliary port is the bound port
(since it leads to the variable bound by the abstraction).

Two nodes of the graph annihilate if they meet along their principal
ports at the same level. In Section 1.1 we have already introduced two
annihilation rules: (Beta) and (Fan-Ann). The other two annihilation rules are
described in Figure 4.


Fig. 4. (1) The rule (Bracket-Ann); (2) the rule (Croissant-Ann)


A node at a given level can also act upon any other node $f$ at a higher level
(reached at its principal port), according to the rules in Figure 5 ($f$ represents
a generic node). In these rules, the nodes are simply propagated through each
other in such a way that their effect on the level structure is left unchanged.
Observe that rules (Fan-Comm), (Fan-App) and (Fan-Lambda) are instances
of the leftmost rule in Figure 5.

Fig. 5. Commutation rules
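The level discipline described in this section can be summarised by a very small Python sketch. It only decides which kind of rule applies when two nodes meet along their principal ports; it does not rewrite any graph and does not check that the two nodes are of compatible kinds, and the function name and returned labels are assumptions made for illustration.

```python
def interaction(level_a, level_b):
    """Classify the interaction of two nodes facing each other.

    Equal levels: the nodes annihilate.
    Different levels: the node at the lower level acts upon (is propagated
    through) the node at the higher level, i.e. a commutation rule applies.
    """
    if level_a == level_b:
        return "annihilate"
    if level_a < level_b:
        return "a acts on b"
    return "b acts on a"
```

For two fans this is the role played by the oracle of the abstract algorithm: matching levels select (Fan-Ann), mismatching levels select (Fan-Comm).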

2.1  The initial encoding of λ-terms

A λ-term $N$ with $n$ free variables will be represented by a graph with $n + 1$
entries (free edges): $n$ for the free variables (the inputs), and one for the root
of the term (the output). The translation is inductively defined by the rules in
Figure 6. The translation function is indexed by an integer which can be thought
of as being the level at which we want the root to be; the translation starts at
level 0, i.e. $[M] = [M]_0$.


Fig.  6.  Initial  translation 

2.2  Consistent  paths  and  the  correctness 

The correctness of Lamping’s algorithm was proved by means of suitable paths
called consistent paths [6]. Let us recall the notions in [5].

The (finite) contexts are the terms generated by the following grammar:

$a ::= \square \mid \circ \cdot a \mid * \cdot a \mid \sharp \cdot a \mid \flat \cdot a \mid \langle a, a \rangle$

We denote by $A^n[a]$ a context of the form $\langle \cdots \langle a, a_n \rangle, \cdots, a_1 \rangle$.


Definition 4.  (Consistent  path)  A  consistent  path  in  a  graph  [M]  is  a  path 
such  that 


1.  every  edge  of  the  path  is  labeled  with  a  context; 

2.  consecutive  pairs  of  edges  satisfy  one  of  the  following  constraints: 


[Figure not reproduced: for each kind of node, the constraint relates the context
labels of the two edges incident to it; the labels are contexts of the form
$A^n[\ldots]$. The individual constraints are not recoverable from this extraction.]

Consistent paths are taken equivalent up to contexts. That is, two consistent
paths having pairwise equal edges are considered equal, even if the contexts
differ.

The above definition describes how the nodes of the graphs modify the
contexts when traversed. Notice that the traversal of a node $n$ can be forbidden
if the external context does not allow the transformation performed by $n$. As a
consequence, there are illegal (better, not consistent) paths.

In order to formalize the statement of correctness we need the preliminary
notions of residual and ancestor path. To this aim, let $[M] \to^* G \to^u G'$ and
let $p$ be a path in $G$ which starts and terminates at two principal ports.¹ The
notion of residual path is not defined when $u$ is an annihilation rule and $u$
involves the endpoints of $p$.

By definition of sharing rules, $u$ is “local”, namely it involves exactly two
nodes $n$ and $n'$ in $G$ and the edges starting at these nodes. Therefore, let $G_b$ be
the subgraph of $G$ where $n$, $n'$ and the edges starting at $n$ and $n'$ are missing. Then
$G_b$ is also a subgraph of $G'$. If $p$ is internal to $G_b$ of $G$ then the residual of $p$ is the
corresponding path in $G_b$ of $G'$. Otherwise
$p = p_1 e_1 m_1 u m'_1 e'_1 p_2 \cdots e_k m_k u m'_k e'_k p_{k+1}$
such that the $p_i$ are internal to the subgraph $G_b$ of $G$, $p_1 e_1$ or $e'_k p_{k+1}$
may be missing and $\{m_i, m'_i\} = \{n, n'\}$. There are two cases:

($u$ is a commutation rule) Let us define the cases when $p_1 e_1$ or $p_1 e_1 m_1 u$ are
missing: the other cases may be defined in a similar way. In this case the residual
of  p  is  niicip2  •  •  •CfcPfc'yfcmJ.cJ.pJt+i)  where  p-  are  the  residuals  of  p*,  c^p-UiinJc; 
are  the  unique  paths  traversing  the  part  of  G'  which  is  not  in  Gi>  such  that 
they  connect  the  ports  which  correspond  to  the  initial  port  of  and  the 
final  port  of  e'-.  The  node  mj  is  consecutive  to  the  initial  node  of  p2  through 
the  port  corresponding  to  the  final  port  of  e'^ . 

($u$ is an annihilation rule) Remark that $p_1 e_1$ and $e'_k p_{k+1}$ cannot be missing in
this case. The residual of $p$ is $p'_1 c_1 \cdots c_k p'_{k+1}$, where the $p'_i$ are the
residuals of the $p_i$ and the $c_i$ are the edges connecting the ports which correspond
to the initial port of $e_i$ and the final port of $e'_i$.

There  is  an  obvious  consequence  of  the  above  definition: 

Proposition 5. The residual of a path (if any) is unique.

The uniqueness of residuals allows us to define the “inverse” notion, called ancestor.

A close inspection of Lamping’s graph rewriting rules reveals that they
preserve the consistency of paths:

Property 6. (The context semantics [5]) Let $[M] \to^* G \to^u G'$, let $\varphi$ be a
consistent path in $G$ and assume $u$ is not an annihilation rule involving the
endpoints of $\varphi$. Then the residual of $\varphi$ exists and it is consistent.
Similarly for ancestors.

¹ This constraint guarantees the uniqueness of the ancestor. Indeed, assume $p$ starting at
the auxiliary port of a node $m$, $m$ is involved in $u$ and $u$ is not the first edge of $p$.
Then the residuals of $p$ and $up$ should be the same.

The context semantics has been used to prove the correctness
of Lamping’s algorithm. In particular, in [5], the authors noticed that consistent
paths starting and terminating at root nodes (the root and the free variables) are
invariant with respect to reduction rules. Since these paths suffice for defining
the Böhm tree of a λ-term, it follows that the implementation is correct.


3  Legality,  fan  annihilations  and  cycles 

An alternative definition of consistency, called legality, has been provided in [3]
(the coincidence of consistent paths and legal paths is in [2]). The notion of
legality has the advantage (with respect to the others) of clarifying the symmetries
inside paths, which, as we will see, are crucial for defining the path
characterization of fan annihilations. Therefore, let us recall briefly the main
definitions and properties of legal paths.

A path is straight if it traverses nodes from auxiliary to principal ports (in
particular, the path cannot “bounce back”, exiting from the same port it
entered through). A straight path is elementary if one end is connected at a
port of a @-node or a λ-node, it traverses control nodes only and the other end
is connected at a λ-node or a @-node or at a (free or bound) variable node. A
discriminant is an elementary path starting at a bound variable (a discriminant
represents an occurrence of the bound variable).

Definition 7. Let $\varphi$ be a straight path connecting the principal port of an
application @ and an abstraction λ. These two nodes are paired (along $\varphi$) if and
only if either $\varphi$ is a redex, or every other application and abstraction internal to
$\varphi$ is paired (along a subpath of $\varphi$).

Definition 8. A straight path $\varphi$ is a well-balanced path (shortly wbp) if and
only if, for each application @ and abstraction λ paired along a subpath of $\varphi$,
the following conditions are satisfied:

1. $\varphi$ traverses @ through the context port if and only if it traverses λ through
the body port;

2. $\varphi$ traverses @ through the argument port if and only if it traverses λ through
the bound port.

Next we define by mutual induction two other types of paths: @-cycles and
v-cycles.

Definition  9. 

(@-cycle)  Let  @  be  an  application  node  in  [M],  u  be  the  argument  edge  of  @ 
and  N  be  the  second  argument  of  An  @-cycle  of  @  is  a  path: 

1.  where  ip  is  internal  to  N; 

2.  or  ‘  where  ipi  are  internal  to  N  and  f  are 

v-cycles  over  some  free  variable  in  N; 
(v-cycle) Let $\gamma$ be a discriminant of λ and starting at $v$. A v-cycle over $v$ is a
path $v\,\gamma\,\lambda\,(\psi)^r\,@\,\varphi\,@\,\psi\,\lambda\,(\gamma)^r\,v$ where $\psi$ is a wbp
starting at @ and terminating at λ.

Definition 10. (Legal paths) A wbp φ is a legal path if and only if, for every
@-cycle ϕ contained in φ, φ can be decomposed in one of the following
possible ways:

(1) [...]

(2) [...]

(3) φ = ζ_1 γ λ ψ @ ϕ @ (ψ)^r λ (γ)^r ζ_2, where @ and λ are paired along ψ
and γ is a discriminant.

In case (3), we shall say that ψ and (ψ)^r are respectively the call and return
paths of the @-cycle.

The above definition essentially says that whenever we have an @-cycle in φ, the
call and return paths, together with the associated discriminants, must be the
same. Cases (1) and (2) cover the cases in which the call or return paths are
not complete.

Definition 11. A legal path @ φ λ, where @ and λ are paired along φ, is called a
virtual redex.

This  definition  is  justified  by  the  following: 

Theorem 12. [3] Given a λ-term M, there is a one-to-one correspondence between
virtual redexes in [M] and all the possible redex families obtained by
evaluating M.

As  proved  in  [2]  consistent  paths  and  legal  paths  are  strongly  related: 

Theorem  13.  Every  wbp  is  legal  if  and  only  if  it  is  consistent. 

The main result of the paper, namely the path characterization of (Fan-Ann)
moves, is stated in Theorem 14 below.

Theorem 14. Let [M] ↠ G and let u be an edge in G connecting the principal
ports of two fans. The interaction u annihilates the two fans if and only if the
ancestor of u is a path ζ λ ψ @ ϕ @ (ψ)^r λ (ζ)^r, where ζ is a discriminant,
ψ is a virtual redex and ϕ is an @-cycle.

It  is  evident  that  Property  2  is  a  smooth  statement  for  Theorem  14. 

4  The  comparison  with  Krivine  machine 

Theorem 14 looks particularly appealing since it provides a new insight for
reasoning about the dynamics of Lamping’s abstract algorithm. This insight
relies on path computations, which are indeed an alternative way of evaluating
λ-terms.

A meaningful application of Theorem 14, which we are going to show, clarifies
the computational correspondence between two algorithms for evaluating
λ-terms: the Krivine machine and Lamping’s abstract algorithm. By this we
mean that it is possible to fix the relationship between the lengths of
computations in the two algorithms. To this aim we use a result recently put
forward by Danos and Regnier [4]: each step of the Krivine machine is actually
a suitable path in the sharing graph. Let us give the key intuition of Danos
and Regnier, omitting the details of [4]. The intuition follows from the
symmetries inside paths and two properties of @-cycles and well-balanced paths
(which we do not recall since they are not relevant to the following
discussion). Take for instance the consistent path

where ζ is a discriminant starting at a croissant at depth n, ψ is a virtual
redex between a λ at depth p (p < n) and an application @ at depth q. Let also
A = ⟨· · · ⟨S, a_n⟩, · · · , a_1⟩ be the initial context of φ. A may be
rewritten as the following pair

⟨a_n · · · a_p · · · a_1, S⟩

where a_n · · · a_p · · · a_1 is called the environment and S is called the
stack. Now we observe that:

1.  By  definition  of  its  final  context  will  have  the  shape  cr  ::  5,  where  a  is 
a  suitable  context  depending  on  an,  •  •  • ,  Op. 

2.  So  we  start  ip  with  the  context  a  ::  S.  Since  ^  is  a  wbp,  by  the  Rendez¬ 
vous  property  in  [2],  the  final  context  of  ip  will  have  the  shape  a  :  :  S,  for 
some 

3.  Now,  by  the  ©-cycle  property  in  [2],  if  we  start  the  ©-cycle  ^  with  a  context 
S'j  a  ::  5,  we  shall  terminate  with  a  context  £' ,  a  ::  S',  for  some  S'. 

4.  The  reverse  path  of  ip  performs  the  reverse  transformation  on  contexts.  So 
at  the  end  of  ip^  we  have  the  context  S,  a  ::  S'. 

5.  For  a  similar  reason,  the  context  at  the  end  of  f'’  has  to  be  an  ap  :: 

S,  a  ::  S'. 

Observe that the purpose of steps 4 and 5 is to restore the initial environment
by using the information on the top of the stack and the environment S'. These
steps may be skipped if we are more careful in steps 1 and 2. That is, let us
save the address d of the croissant at the beginning of ζ and the environment
a_n · · · a_p :: S on top of the stack S. Namely, the stack at the end of steps
1 and 2 is (d, a_n · · · a_p) :: S. Then, at the end of the @-cycle, we have a
context S', (d, a_n · · · a_p :: S) :: S' and we may safely skip steps 4 and 5,
just by restoring what is on top of the stack and jumping to d.

It turns out that the above optimization corresponds to the step of the Krivine
machine performed when a bound name is met, while the step where the pair
(d, a_n · · · a_p) is saved on the stack corresponds to the “stacking” move in
the Krivine machine. The third reduction of the Krivine machine, the β-move,
corresponds to subpaths which are virtual redexes. Hence the following result:

Theorem 15. [4] The optimized path computation is isomorphic to the Krivine
machine.
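For readers who want the three moves of the Krivine machine in concrete form, the following Python sketch runs the textbook formulation of the machine on closed terms in de Bruijn notation. The term encoding, the function name `krivine` and the step counter are assumptions made for illustration; they are not taken from [4] or from the path computation above.

```python
def krivine(term, max_steps=10_000):
    """Run a Krivine machine on a closed term in de Bruijn notation.

    Terms are tuples: ("var", n), ("lam", body), ("app", fun, arg).
    Returns the final (term, env) closure and the number of moves made.
    """
    env, stack, steps = [], [], 0
    while steps < max_steps:
        tag = term[0]
        if tag == "app":
            # "Stacking" move: save the argument with its environment.
            stack.append((term[2], env))
            term = term[1]
        elif tag == "lam":
            if not stack:
                break  # weak head normal form reached
            # Beta move: bind the closure on top of the stack.
            env = [stack.pop()] + env
            term = term[1]
        else:
            # Variable move: jump to the closure bound to the variable.
            term, env = env[term[1]]
        steps += 1
    return (term, env), steps


identity = ("lam", ("var", 0))
(result, _), steps = krivine(("app", identity, identity))
print(result, steps)
```

On the term (λx.x)(λy.y) the machine performs one stacking move, one β-move and one variable move before stopping at the identity.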

Since  Theorem  15  gives  a  path  characterization  of  steps  of  Krivine  machine, 
we  may  establish  the  computational  correspondence  between  Krivine  machine 
and  Lamping’s  abstract  algorithm. 

Theorem 16. Let M be a λI-term reducing to a constant. The length of the
computation of M in Lamping’s abstract algorithm is at most O(n), where n is
the length of the Krivine machine computation.


5  Conclusions 

The path characterization of fan annihilations is meant as a first step
towards the goal of determining the total amount of work required by Lamping’s
(abstract) algorithm. We observe that a direct evaluation of this parameter looks
very problematic, especially since not all sharing graphs can be obtained by the
reduction of a λ-term, and nothing is known about the structure of these “legal
graphs”. As a consequence, no reasoning by induction on the size or the structure
of these graphs seems possible. Conversely, computing paths in a λ-term looks
like a more realistic and promising research direction, since in this case we
can profit from all the theoretical machinery of the geometry of interaction
and its dynamic
algebra [2].

Abstract. In this paper we consider an on-line problem related to minimizing
the diameter of a dynamic tree T. A new edge f is added, and our task is to
delete the edge e of the induced cycle so as to minimize the diameter of the
resulting tree T ∪ {f} \ {e}. Starting with a tree with n nodes, we show how
each such best swap can be found in worst-case O(log² n) time. The problem was
raised by Italiano and Ramaswami at ICALP’94 together with a related problem
for edge deletions. Italiano and Ramaswami solved both problems in O(n) time
per operation.


1  Introduction 

The diameter of a tree is the length of a longest simple path in the tree, and
such a path is called a diameter path. The unique midpoint of all diameter
paths is called the center; hence the center is the point whose maximal
distance to any node is as small as possible. In 1973 Handler [4] showed how to
compute the diameter (and center) of a tree in linear time. However, as
pointed out by Rauch [8], too little work has been done on dynamically
maintaining information about the diameter. To the best of our knowledge, the
only dynamic algorithms concerning diameters are those given by Italiano and
Ramaswami in ICALP’94 [5], motivated by problems in high-speed wide-area
networks (see [6, 7] for details). They consider how to minimize the diameter
of a dynamic tree T with n nodes and non-negative edge costs. Let f be a new
edge which introduces a cycle C in the dynamic tree. Then removing an edge e
from the cycle C is called a swap(e, f). The best swap is the swap which
minimizes the diameter of the resulting tree T_{f/e} = T ∪ {f} \ {e}. In this
paper we present an on-line algorithm for maintaining a dynamic tree, such that
given a new edge, the tree computed is the tree resulting from the best swap.
Italiano and Ramaswami [5] presented an O(n) time algorithm for finding a best
swap. In this paper, we show how to improve the complexity to O(log² n)
worst-case time.
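As a point of reference for the static problem, the following Python sketch computes the diameter of a weighted tree in linear time using the standard two-pass farthest-node method. It is not claimed to be Handler's algorithm [4]; the adjacency-list representation and the names `farthest` and `tree_diameter` are assumptions made for illustration.

```python
def farthest(adj, start):
    """Return (node, distance) for a node farthest from start in a tree."""
    best_node, best_dist = start, 0
    stack = [(start, None, 0)]
    while stack:
        node, parent, dist = stack.pop()
        if dist > best_dist:
            best_node, best_dist = node, dist
        for nxt, cost in adj[node]:
            if nxt != parent:  # a tree has no other way back
                stack.append((nxt, node, dist + cost))
    return best_node, best_dist


def tree_diameter(adj):
    """Diameter of a tree with non-negative edge costs."""
    any_node = next(iter(adj))
    end, _ = farthest(adj, any_node)   # first pass: find one end
    _, diameter = farthest(adj, end)   # second pass: measure from it
    return diameter


# Example tree: 1-2 (cost 3), 2-3 (cost 1), 2-4 (cost 2), 4-5 (cost 4).
adj = {
    1: [(2, 3)],
    2: [(1, 3), (3, 1), (4, 2)],
    3: [(2, 1)],
    4: [(2, 2), (5, 4)],
    5: [(4, 4)],
}
print(tree_diameter(adj))
```

In this example the diameter path runs from node 1 to node 5 and has length 3 + 2 + 4 = 9.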

Italiano and Ramaswami [5] considered the above incremental best swap
problem as part of a fully dynamic type heuristic for maintaining a small
diameter spanning tree T in a dynamic connected graph G. If an edge e is added
to G, the above incremental algorithm is called to find a best swap for T with
e. If an edge e is deleted and it belongs to T, they have a complementing
decremental algorithm that finds a “best swap” edge from G reconnecting T and
minimizing the resulting diameter of T. They supported both insertions and
deletions in time O(n). Note that the above scheme does not maintain the
spanning tree of the smallest possible diameter. As mentioned, this paper does
not consider the decremental problem.

As an intermediate step to our algorithm we show how to maintain a dynamic
forest of trees under link and cut such that, given a node, we can return the
diameter of the tree the node belongs to. The time complexity is O(log n) for
each operation, where n is the number of nodes in the tree(s) involved. We show
this since, to the best of our knowledge, no such algorithm has been presented
before.

All our results are based on topology trees [3, 2] (the terminology of topology
trees is recalled in Section 2). Our algorithm for maintaining the diameter is
straightforward, based on a simple observation. Our algorithm for finding a best
swap is much more involved. One complication is that when we want to merge
two clusters, we need to consider not only the information associated with the
clusters being merged, but also the information associated with O(log n)
sub-clusters of each of the two clusters. This implies that a merge takes
O(log n) time, and each best swap gives rise to O(log n) merges. Thus our
O(log² n) time algorithm for best swap is derived.

The paper is organized as follows: In section 2 preliminaries are given. In
section 3 we present an algorithm for maintaining the diameters of trees in a
dynamic forest. Finally in section 4 we give an algorithm which computes a best
swap in O(log² n) time.

2  Preliminaries 

In this section we give a short presentation of the topology trees by
Frederickson [3, 2]. Our presentation differs slightly from the original
topology trees. We provide a simpler interface in order to simplify the use of
the topology trees.

Let T be a tree with n nodes. For a connected subtree of T, we call a node
which has edges out of the subtree a boundary node. A cluster is a connected
subtree of T with at most two boundary nodes. The set of boundary nodes of a
cluster C is denoted ∂C. We say that ∂C = {a, b} if C has boundary nodes a
and b, even if a and b are identical. Two clusters are said to be neighbours if
they intersect in exactly one node. A topology tree τ of T is a binary tree
such that:¹

1. The nodes of τ represent clusters of T.

2. The leaves of τ represent the edges of T.

3. If C is represented by an internal node of τ with children representing A and
B, then C = A ∪ B and A and B are neighbours.

4. The root of τ represents T.

5. The height of τ is O(log n).


¹ In this description all leaf clusters contain only one edge; however, the
simplification presented in this paper holds for any size of the leaf clusters.

A tree with a single node has an empty topology tree.

In order to maintain topology trees for a forest of dynamic trees we make use
of the following operations: Merge takes two topology tree root nodes a and b and
creates a new topology tree root with children a and b. By the definition above,
only nodes representing neighbouring clusters may be Merged.
DeleteRoot is the reverse operation, deleting the root of a topology tree.

This presentation of the topology trees differs in the interface from those
in [2, 3]. For the topology trees presented we have made no restriction on the
degree of the tree for which the topology tree is used, and we have reduced the
number of different ways clusters can be related to each other. For lack of
space, we defer the description of the modification to the full journal version,
in which we will show that it does not change the complexity of the topology
tree operations. From Frederickson [2, lemma 1, theorem 2] and Frederickson [3,
lemma 2.3] we have the following proposition for topology trees.

Proposition 1. A topology tree τ of a tree T with n nodes can be computed
using a linear number of Merge operations. Topology trees for a forest of trees
can be maintained under link and cut, using O(log n) Merges and DeleteRoots
per link and cut operation.

Consequently

Theorem 2. Let info be some information of clusters in a dynamic forest with
n nodes so that

1. For any edge e, info({e}) can be computed in time t_1.

2. For any neighbouring clusters C_1 and C_2, info(C_1 ∪ C_2) can be computed in
time t_2, given info(C_1) and info(C_2).

Then we can maintain info for all trees in a dynamic forest in O(t_1 + t_2 log n)
time per link and cut, given the ability to use O(n · (t_1 + t_2)) time and O(n)
space for preprocessing.

3  Dynamic  Diameters 

In this section we will present a simple algorithm for maintaining information
about the size of diameters of trees in a dynamic forest under link and cut. The
algorithm will be used in the following section. It builds on a generalization of
former exploitations of properties of diameters and spanning trees (see e.g. [1,
4, 5]). This generalization, given in the following lemma, makes it possible to
construct efficient divide and conquer algorithms.

Let T = (V, E) be a tree with n nodes. With each edge e in E is associated a
nonnegative number cost(e). For two nodes a, b ∈ V we then define the distance,
dist(a, b), to be the sum of costs for all edges on the simple path from a to b
in the tree. For a subset of nodes W ⊆ V we define
diam_T(W) = max_{a,b ∈ W} dist(a, b); hence the diameter of the tree is
diam_T(V). By the path from a to b, denoted a···b, we mean both the set of edges
and the set of nodes on that path.

Lemma 3. Let T = (V, E) be a tree, {a, b} ⊆ V' ⊆ V, {c, d} ⊆ V'' ⊆ V, where
dist(a, b) = diam_T(V') and dist(c, d) = diam_T(V''); then
diam_T(V' ∪ V'') = diam_T({a, b, c, d}).

Proof. Assume for contradiction that diam_T({a, b, c, d}) < diam_T(V' ∪ V'').
Then there exist e ∈ V' \ V'', f ∈ V'' \ V' so that
dist(e, f) = diam_T(V' ∪ V'') > diam_T({a, b, c, d}). Now either e ∉ {a, b} or
f ∉ {c, d}. Say e ∉ {a, b}. Let P denote the path e···f. Let x, y ∈ P be the
nodes such that a···x ∩ P = {x} and b···y ∩ P = {y}. Now assume w.l.o.g. that
a, x, b, y are arranged such that dist(e, x) ≤ dist(e, y). We now have
dist(a, x) + dist(x, b) ≥ dist(a, b) ≥ dist(e, b) = dist(e, x) + dist(x, b),
hence dist(a, x) ≥ dist(e, x), which yields dist(a, f) ≥ dist(e, f),
contradicting our assumption. We therefore conclude that e ∈ {a, b} and
symmetrically f ∈ {c, d}, which concludes the proof. □
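Lemma 3 can be checked by brute force on a small instance. The Python sketch below is only an illustration under assumed names (`all_dists`, `diam_pair`) and an assumed example tree; it compares the diameter of V' ∪ V'' with the diameter of the four endpoint candidates.

```python
from itertools import combinations


def all_dists(adj, src):
    """Distances from src to every node of a tree given as adjacency lists."""
    dist, stack = {src: 0}, [src]
    while stack:
        u = stack.pop()
        for v, cost in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + cost
                stack.append(v)
    return dist


def diam_pair(dist, nodes):
    """Return (diameter, (a, b)) over a node subset, by trying all pairs."""
    nodes = list(nodes)
    best = (0, (nodes[0], nodes[0]))
    for a, b in combinations(nodes, 2):
        if dist[a][b] > best[0]:
            best = (dist[a][b], (a, b))
    return best


# Example tree: 1-2 (cost 3), 2-3 (cost 1), 2-4 (cost 2), 4-5 (cost 4).
adj = {
    1: [(2, 3)],
    2: [(1, 3), (3, 1), (4, 2)],
    3: [(2, 1)],
    4: [(2, 2), (5, 4)],
    5: [(4, 4)],
}
dist = {v: all_dists(adj, v) for v in adj}

v1, v2 = {1, 2, 3}, {2, 4, 5}
_, (a, b) = diam_pair(dist, v1)      # endpoints realising diam_T(V')
_, (c, d) = diam_pair(dist, v2)      # endpoints realising diam_T(V'')
union_diam, _ = diam_pair(dist, v1 | v2)
candidate_diam, _ = diam_pair(dist, {a, b, c, d})
print(union_diam, candidate_diam)
```

Only the four candidate endpoints need to be inspected, which is what makes a constant-time merge of two clusters possible.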

We now show how to use lemma 3 with theorem 2. Given two neighbouring
clusters C_1 and C_2 of a tree T we can compute diam_T(C_1 ∪ C_2) given the
following information, info(C), for each of the clusters C_1 and C_2.

1. The boundary nodes ∂C.

2. Two nodes a, b ∈ C with dist(a, b) = diam_T(C).

3. The distances between the nodes above.

As  we  will  show  in  the  journal  version,  it  is  now  straightforward  to  prove: 

Theorem 4. There exists an algorithm that maintains the diameters of trees in
a dynamic forest in time O(log n) under link and cut, given the ability to use
O(n) time and space for preprocessing, where n is the number of nodes in the
tree(s) involved in the operation. □

4  Best  swap 

Given a tree T with n nodes and an edge f = (b_1, b_2) not in T, we wish to find
an edge e on the cycle C = b_1···b_2 ∪ {f} that yields the smallest diameter of
T_{f/e} = T ∪ {f} \ {e}.

Using theorem 4 we can maintain the diameter of a tree dynamically under
link and cut using O(log n) time per operation. So when an edge f is presented
we can solve the best swap problem in O(k log n) time, where k is the number of
edges on the cycle C, by simply trying them one by one. In general this is worse
than the O(n) algorithm given by Italiano and Ramaswami [5]. In this
section we will provide an O(log² n) time solution to the problem.
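The "try them one by one" baseline is easy to state in code. The Python sketch below is an assumption-laden illustration: it recomputes each diameter from scratch in linear time instead of using the dynamic structure of theorem 4, and the names `naive_best_swap`, `path_edges` and the edge-triple representation (u, v, cost) are hypothetical.

```python
def build_adj(edges):
    adj = {}
    for u, v, cost in edges:
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))
    return adj


def farthest(adj, start):
    """Return (node, distance) for a node farthest from start."""
    best, stack = (start, 0), [(start, None, 0)]
    while stack:
        node, parent, dist = stack.pop()
        if dist > best[1]:
            best = (node, dist)
        for nxt, cost in adj[node]:
            if nxt != parent:
                stack.append((nxt, node, dist + cost))
    return best


def diameter(edges):
    adj = build_adj(edges)
    end, _ = farthest(adj, next(iter(adj)))
    return farthest(adj, end)[1]


def path_edges(edges, src, dst):
    """Edges on the tree path from src to dst, in order."""
    adj, parent, stack = build_adj(edges), {src: None}, [src]
    while stack:
        u = stack.pop()
        for v, cost in adj[u]:
            if v not in parent:
                parent[v] = (u, cost)
                stack.append(v)
    path, node = [], dst
    while parent[node] is not None:
        u, cost = parent[node]
        path.append((u, node, cost))
        node = u
    return path[::-1]


def naive_best_swap(edges, f):
    """Try every edge e on the cycle; return (best edge, best diameter)."""
    best_edge, best_diam = f, diameter(edges)  # swap(f, f) leaves T unchanged
    for e in path_edges(edges, f[0], f[1]):
        swapped = [g for g in edges if {g[0], g[1]} != {e[0], e[1]}] + [f]
        d = diameter(swapped)
        if d < best_diam:
            best_edge, best_diam = e, d
    return best_edge, best_diam


# Path 1-2-3-4-5 with unit costs; the new edge joins nodes 1 and 4.
edges = [(1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 5, 1)]
print(naive_best_swap(edges, (1, 4, 1)))
```

Here the original diameter is 4, and swapping out an edge of the cycle next to node 1 or node 2 lowers it to 3.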


4.1  Outline  of  the  algorithm 

If we dynamically maintain the diameter of the tree, using theorem 4, we already
know the diameter of the tree T = T_{f/f}. Therefore we only need to concentrate
on finding the edge e on the path b_1···b_2 which minimizes the diameter of
T_{f/e}.

If we remove an edge e ∈ b_1···b_2 from T, we divide it into two subtrees
dependent on e: T_{b_1}(e) and T_{b_2}(e), where b_1 ∈ T_{b_1}(e) and
b_2 ∈ T_{b_2}(e). We know from lemma 3 that when linking T_{b_1}(e) and
T_{b_2}(e) with f, the diameter of the combined tree T_{f/e} is the maximum of
diam(T_{b_1}(e)), diam(T_{b_2}(e)) and the longest path in T_{f/e} which
includes f, denoted maxpath_f(e). From now on we assume that T_{b_1}(e) is
rooted in b_1 and T_{b_2}(e) is rooted in b_2. Then the length of the longest
path containing the edge f in T_{f/e}, maxpath_f(e), becomes
height(T_{b_1}(e)) + cost(f) + height(T_{b_2}(e)). Since cost(f) is constant,
minimizing maxpath_f(e) means minimizing height(T_{b_1}(e)) + height(T_{b_2}(e)).
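The decomposition of diam(T_{f/e}) into two subtree diameters and maxpath_f(e) can be evaluated directly on a small example. The following Python sketch uses hypothetical helper names (`dists_from`, `component_stats`, `decomposed_diameter`) and assumes edges are given as (u, v, cost) triples with e on the path between the endpoints of f.

```python
def dists_from(adj, src):
    """Distances from src inside its connected component."""
    dist, stack = {src: 0}, [src]
    while stack:
        u = stack.pop()
        for v, cost in adj.get(u, []):
            if v not in dist:
                dist[v] = dist[u] + cost
                stack.append(v)
    return dist


def component_stats(edges, root):
    """(height, diameter) of the component containing root, rooted at root."""
    adj = {}
    for u, v, cost in edges:
        adj.setdefault(u, []).append((v, cost))
        adj.setdefault(v, []).append((u, cost))
    from_root = dists_from(adj, root)
    height = max(from_root.values())
    end = max(from_root, key=from_root.get)    # one end of a diameter path
    diam = max(dists_from(adj, end).values())
    return height, diam


def decomposed_diameter(edges, f, e):
    """Return (diam(T_{f/e}), maxpath_f(e)) via the three-way maximum."""
    b1, b2, cost_f = f
    rest = [g for g in edges if {g[0], g[1]} != {e[0], e[1]}]
    h1, d1 = component_stats(rest, b1)   # subtree containing b1
    h2, d2 = component_stats(rest, b2)   # subtree containing b2
    maxpath = h1 + cost_f + h2
    return max(d1, d2, maxpath), maxpath


# Path 1-2-3-4-5 with unit costs; the new edge f joins nodes 1 and 4.
edges = [(1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 5, 1)]
f = (1, 4, 1)
for e in edges[:3]:                      # the edges on the path 1···4
    print(e, decomposed_diameter(edges, f, e))
```

For the three cycle edges the sketch yields diameters 3, 3 and 4, so in this example maxpath_f(e) determines the diameter for every choice of e.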

To ease the following discussion, we will introduce notation regarding the
order of edges and nodes on a path. Let a···b be a path, and let e, e' ∈ a···b.
We then have the order relation with respect to a···b: e ⪯ e' iff
dist(a, e) ≤ dist(a, e'), and similarly e ≺ e' iff dist(a, e) < dist(a, e').

The following theorem, proven in section 4.2, is the basis of our algorithm.

Theorem 5. Let b_1, b_2 be nodes in a tree T and let f = (b_1, b_2) be an edge
not in T. Then there exist two nodes v_1, v_2 ∈ b_1···b_2 such that v_1 ⪯ v_2
and for any edge e ∈ b_1···b_2:

    diam(T_{f/e}) = diam(T_{b_2}(e)),  if e ∈ b_1···v_1
    diam(T_{f/e}) = maxpath_f(e),      if e ∈ v_1···v_2
    diam(T_{f/e}) = diam(T_{b_1}(e)),  if e ∈ v_2···b_2

The  algorithm  consists  of  the  following  steps. 

Algorithm  1 

1. Find v_1 and v_2.

2. Minimize diam(T_{b_2}(e)) on b_1···v_1.

3. Minimize maxpath_f(e) on v_1···v_2.

4. Minimize diam(T_{b_1}(e)) on v_2···b_2.

5. Compare with diam(T) and select the best swap.

In section 4.2 we prove theorem 5 and we show how to find v_1, v_2 and how to
minimize the diameters of the subtrees. In section 4.3 we show how to minimize
maxpath_f(e), which is the difficult part of the algorithm.

4.2  What  and  how  to  minimize 

In order to prove theorem 5 we now proceed to investigate the behavior of
diam(T_{b_1}(e)), diam(T_{b_2}(e)) and maxpath_f(e) when e ∈ b_1···b_2. We know
that when linking two trees, the diameter of the resulting tree is greater than
or equal to the diameters of both the original trees. Although maxpath_f(e) is
not a simple monotone function, it still bears some relationship with
diam(T_{b_1}(e)) and diam(T_{b_2}(e)), as we will show in the next two lemmas.

Lemma 6. Let T_1 be a tree with root x. Let T_2 be another tree and let T be the
tree rooted in x obtained by linking T_1 and T_2 with some arbitrary edge e. Then
height(T) − height(T_1) ≤ diam(T) − diam(T_1). □

Lemma 7. There exists a node w ∈ b_1···b_2 such that
diam(T_{b_1}(e)) < maxpath_f(e) when e ∈ b_1···w and
diam(T_{b_1}(e)) ≥ maxpath_f(e) when e ∈ w···b_2.

Proof. We prove it by showing that as we move the edge e from b_1 to b_2,
diam(T_{b_1}(e)) grows at least as much as maxpath_f(e).

Formally let e', e'' ∈ b_1···b_2 be edges such that e' ≺ e''. Then

    maxpath_f(e'') − maxpath_f(e')

    = height(T_{b_1}(e'')) + height(T_{b_2}(e''))
      − height(T_{b_1}(e')) − height(T_{b_2}(e'))

    ≤ height(T_{b_1}(e'')) − height(T_{b_1}(e')),
      since height(T_{b_2}(e'')) − height(T_{b_2}(e')) ≤ 0

    ≤ diam(T_{b_1}(e'')) − diam(T_{b_1}(e')), by lemma 6.

Thus, if there exists an edge e = (x_1, x_2) such that
diam(T_{b_1}(e)) ≥ maxpath_f(e), then diam(T_{b_1}(e)) ≥ maxpath_f(e) for all
edges e ∈ x_1···b_2. By the same argument, if there exists an edge
e = (y_1, y_2) such that diam(T_{b_1}(e)) < maxpath_f(e), then
diam(T_{b_1}(e)) < maxpath_f(e) for all e ∈ b_1···y_2. The node w is then the
node with greatest distance to b_1 such that diam(T_{b_1}(e)) < maxpath_f(e) for
e ∈ b_1···w. □

Proof of theorem 5. By lemma 7 we know that there exists a node w_2 such
that diam(T_{b_1}(e)) ≥ maxpath_f(e) when e ∈ w_2···b_2 and
diam(T_{b_1}(e)) < maxpath_f(e) when e ∈ b_1···w_2. By symmetry there exists a
node w_1 so that diam(T_{b_2}(e)) ≥ maxpath_f(e) when e ∈ b_1···w_1 and
diam(T_{b_2}(e)) < maxpath_f(e) when e ∈ w_1···b_2. From this we see that if
w_1 ⪯ w_2 on b_1···b_2 then we can choose v_1 = w_1, v_2 = w_2, and
maxpath_f(e) is greater than or equal to both diam(T_{b_1}(e)) and
diam(T_{b_2}(e)) when e ∈ v_1···v_2. Otherwise, if w_1 ≻ w_2, then for all
e ∈ b_1···b_2 either diam(T_{b_1}(e)) ≥ maxpath_f(e) or
diam(T_{b_2}(e)) ≥ maxpath_f(e), since the diameters of both subtrees are at
least as great as maxpath_f(e) when e ∈ w_2···w_1. If this is the case then
there exists a node v ∈ b_1···b_2 such that diam(T_{b_2}(e)) ≥ diam(T_{b_1}(e))
when e ∈ b_1···v and diam(T_{b_1}(e)) ≥ diam(T_{b_2}(e)) when e ∈ v···b_2. In
this case we choose v_1 = v_2 = v, which concludes the proof. □

Proposition 8. The nodes v_1 and v_2 can be computed in O(log² n) time.

Proof. We have diam(T_{b_2}(e)) = diam(T_{f/e}) for e ∈ b_1···v_1 and
diam(T_{b_2}(e)) < diam(T_{f/e}) for e ∈ v_1···b_2. Thus, using the topology
tree structure of section 2, v_1 is found by a simple binary search where each
query is based on linking and cutting trees in O(log n) time, as described in
theorem 4. The node v_2 is found symmetrically. □

Proposition 9. We can minimize diam(T_{b_2}(e)) on b_1···v_1 and
diam(T_{b_1}(e)) on v_2···b_2 in O(log n) time.

Proof. The edge e ∈ b_1···v_1 which minimizes diam(T_{b_2}(e)) is simply the
edge with the greatest distance to b_1, since diam(T_{b_2}(e)) is monotonically
decreasing as e moves from b_1 to v_1. Similarly the edge minimizing
diam(T_{b_1}(e)) is the edge with greatest distance to b_2 on v_2···b_2. These
edges are easily found in O(log n) time, using the topology tree structure
described in section 2. □

In this section we have shown that v_1 and v_2 can be found in O(log² n) time.
In fact these two nodes can be found in O(log n) time, which we will show in
the journal version. Because that argument is rather technical and would not
change the overall complexity, we have only given the simple argument above.


4.3  Minimizing  the  sum  of  the  heights 

Recall from section 4.1 that if the new edge f is involved in the diameter of
T_{f/e}, e ∈ b_1···b_2, then diam(T_{f/e}) = maxpath_f(e) =
height(T_{b_1}(e)) + cost(f) + height(T_{b_2}(e)), and minimizing maxpath_f(e)
means minimizing height(T_{b_1}(e)) + height(T_{b_2}(e)). By theorem 5 we know
that we only need to minimize maxpath_f(e) on the path v_1···v_2.

For any node v and any set of edges E', let maxdist_{E'}(v) denote the maximum
distance from v reachable in E'. For any path P = p_1···p_2 with p_1 ≠ p_2 let
First(P) and Last(P) denote the edges on P incident to p_1 and p_2 respectively.

Let U be the subtree of T which consists of all the nodes reachable from v_1
(and v_2) without using any edges from b_1···v_1 ∪ v_2···b_2. Then U is a
cluster of T with ∂U ⊆ {v_1, v_2}.

For any edge e ∈ v_1···v_2 we have

    height(T_{b_1}(e))
      = max{dist(b_1, v_1) + maxdist_{U\{e}}(v_1), maxdist_{T\U}(b_1)}
      = max{maxdist_{U\{e}}(v_1), maxdist_{T\U}(b_1) − dist(b_1, v_1)}
        + dist(b_1, v_1)
      = max{maxdist_{U\{e}}(v_1), h_{v_1}} + dist(b_1, v_1)
    where h_{v_1} = maxdist_{T\U}(b_1) − dist(b_1, v_1)

    height(T_{b_2}(e)) = max{maxdist_{U\{e}}(v_2), h_{v_2}} + dist(b_2, v_2)
    where h_{v_2} = maxdist_{T\U}(b_2) − dist(b_2, v_2)

Thus in order to solve the problem, all we need to know about the tree outside
U is the constant values h_{v_1} and h_{v_2}.

Definition 10. Let C be a cluster with ∂C = {a, b}, let e ∈ a···b be an edge
and let h_a and h_b be any nonnegative numbers. Then define
hsum_C(e, h_a, h_b) = max{maxdist_{C\{e}}(a), h_a} + max{maxdist_{C\{e}}(b), h_b}.

With this definition, we have height(T_{b_1}(e)) + height(T_{b_2}(e)) =
hsum_U(e, h_{v_1}, h_{v_2}) + dist(b_1, v_1) + dist(b_2, v_2) for
e ∈ v_1···v_2, where the last two terms are constants independent of e.

Lemma 11. Let A, B and C = A ∪ B be clusters with ∂A = {a, c}, ∂B = {b, c}
and ∂C = {a, b}, a ≠ b, and let h_a and h_b be any nonnegative numbers. For any
edge e_1 ∈ a···c and e_2 ∈ c···b we have:

    hsum_C(e_1, h_a, h_b)
      = hsum_A(e_1, h_a, max{maxdist_B(b), h_b} − dist(b, c)) + dist(b, c)
    hsum_C(e_2, h_a, h_b)
      = hsum_B(e_2, max{maxdist_A(a), h_a} − dist(a, c), h_b) + dist(a, c).

Proof.

    hsum_C(e_1, h_a, h_b)
      = max{maxdist_{C\{e_1}}(a), h_a} + max{maxdist_{C\{e_1}}(b), h_b}

      = max{maxdist_{A\{e_1}}(a), h_a} +
        max{max{maxdist_{A\{e_1}}(c) + dist(b, c), maxdist_B(b)}, h_b}

      = max{maxdist_{A\{e_1}}(a), h_a} +
        max{maxdist_{A\{e_1}}(c) + dist(b, c), max{maxdist_B(b), h_b}}

      = max{maxdist_{A\{e_1}}(a), h_a} +
        max{maxdist_{A\{e_1}}(c), max{maxdist_B(b), h_b} − dist(b, c)}
        + dist(b, c)

      = hsum_A(e_1, h_a, max{maxdist_B(b), h_b} − dist(b, c)) + dist(b, c)

The second equation follows by symmetry. □
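The first identity of lemma 11 can be verified numerically on a small pair of neighbouring clusters. In the Python sketch below, clusters are represented as lists of (u, v, cost) edges; this representation, the helper names `maxdist` and `hsum`, and the example values are assumptions chosen for illustration.

```python
def maxdist(edges, v):
    """Maximum distance from v reachable using only the given edges."""
    adj = {}
    for x, y, cost in edges:
        adj.setdefault(x, []).append((y, cost))
        adj.setdefault(y, []).append((x, cost))
    dist, stack = {v: 0}, [v]
    while stack:
        u = stack.pop()
        for w, cost in adj.get(u, []):
            if w not in dist:
                dist[w] = dist[u] + cost
                stack.append(w)
    return max(dist.values())


def hsum(cluster, e, a, b, ha, hb):
    """hsum_C(e, h_a, h_b) for a cluster with boundary nodes a and b."""
    rest = [g for g in cluster if g != e]
    return max(maxdist(rest, a), ha) + max(maxdist(rest, b), hb)


# Cluster A has boundary {a, c} = {1, 3}; cluster B has boundary {c, b} = {3, 5}.
A = [(1, 2, 2), (2, 3, 1), (2, 6, 4)]
B = [(3, 4, 3), (4, 5, 1), (4, 7, 2)]
C = A + B
a, c, b = 1, 3, 5
dist_bc = 3 + 1          # cost of the path 3-4-5
ha, hb = 1, 2

for e1 in [(1, 2, 2), (2, 3, 1)]:            # the edges on the path a···c
    lhs = hsum(C, e1, a, b, ha, hb)
    rhs = hsum(A, e1, a, c, ha, max(maxdist(B, b), hb) - dist_bc) + dist_bc
    print(e1, lhs, rhs)
```

Both cycle edges give the same value on the two sides, so the contribution of B enters only through the single number max{maxdist_B(b), h_b} − dist(b, c).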

Definition 12. Let C, a, b, h_a and h_b be defined as in lemma 11, then define
BestCuts_C(h_a, h_b) to be the set of edges e ∈ a···b minimizing
hsum_C(e, h_a, h_b).

This  definition  of  BestCuts  satisfies  the  following  two  lemmas. 

Lemma 13. Let A, B, C, a, b, c, h_a and h_b be as in lemma 11, then

    BestCuts_C(h_a, h_b) ∩ A
      = BestCuts_A(h_a, max{maxdist_B(b), h_b} − dist(b, c))   ∨
    BestCuts_C(h_a, h_b) ∩ B
      = BestCuts_B(max{maxdist_A(a), h_a} − dist(a, c), h_b).

Proof. If there exists an edge e_A in BestCuts_C(h_a, h_b) ∩ A then e_A must
minimize hsum_C(e_A, h_a, h_b) on the path a···c, and by lemma 11 it must also
minimize hsum_A(e_A, h_a, max{maxdist_B(b), h_b} − dist(b, c)) on that path. But
then BestCuts_C(h_a, h_b) ∩ A =
BestCuts_A(h_a, max{maxdist_B(b), h_b} − dist(b, c)) as desired. By symmetry, if
there exists an edge e_B in BestCuts_C(h_a, h_b) ∩ B then
BestCuts_C(h_a, h_b) ∩ B =
BestCuts_B(max{maxdist_A(a), h_a} − dist(a, c), h_b). And since a ≠ b, at least
one of e_A and e_B must exist, yielding the desired result. □

Lemma 14. Let C, a, b, h_a and h_b be as in lemma 11, then

    h_a ≥ maxdist_C(a) ⇒ Last(a···b) ∈ BestCuts_C(h_a, h_b)
    h_b ≥ maxdist_C(b) ⇒ First(a···b) ∈ BestCuts_C(h_a, h_b).

Proof. Assume h_a ≥ maxdist_C(a). Then hsum_C(e, h_a, h_b) = h_a +
max{maxdist_{C\{e}}(b), h_b} for all e ∈ a···b. But then any edge minimizing
maxdist_{C\{e}}(b) will also minimize hsum_C(e, h_a, h_b), and since Last(a···b)
is such an edge we have Last(a···b) ∈ BestCuts_C(h_a, h_b), which proves the
first part. The second part follows by symmetry. □

With this in hand we may now proceed to provide a procedure bestcutedge
that finds an edge from BestCuts_C(h_a, h_b). For any cluster with only one edge,
it should just return that edge. For all other clusters we have the following
proposition.

Proposition 15. Let A, B, C, a, b, c, h_a and h_b be as in lemma 11. If we define

    bestcutedge_C(h_a, h_b) =
        e_A,  if b = c (|∂B| = 1)
        e_B,  if a = c (|∂A| = 1)
        e_A,  if hsum_C(e_A, h_a, h_b) < hsum_C(e_B, h_a, h_b)
        e_B,  otherwise

    where e_A =
        Last(a···c),  if h_a ≥ maxdist_A(a)
        bestcutedge_A(h_a, max{maxdist_B(b), h_b} − dist(b, c)),  otherwise

    and e_B =
        First(c···b),  if h_b ≥ maxdist_B(b)
        bestcutedge_B(max{maxdist_A(a), h_a} − dist(a, c), h_b),  otherwise

Then bestcutedge_C(h_a, h_b) ∈ BestCuts_C(h_a, h_b).

Proof. By lemma 13 and lemma 14 we have that either e_A or e_B belongs to
BestCuts_C(h_a, h_b), and since bestcutedge_C(h_a, h_b) picks the one minimizing
hsum_C(e, h_a, h_b) we have bestcutedge_C(h_a, h_b) ∈ BestCuts_C(h_a, h_b) as
desired. □

Proposition 15 gives us a recursive way of finding a best cut edge for a cluster
in a topology tree. The idea is now, for each cluster C with $\partial C = \{a,b\}$, to save
the latest value found by a call $\mathrm{bestcutedge}_C(h_a,h_b)$ together with $h_a$, $h_b$ and
$\mathrm{hsum}_C(e,h_a,h_b)$. Then if the next call $\mathrm{bestcutedge}_C(h'_a,h'_b)$ has $h'_a = h_a$ and
$h'_b = h_b$, we can immediately return the desired values in constant time. Otherwise,
if $h'_a > h_a$ and $h'_b > h_b$ the memorization means that we only need to do a
logarithmic number of recalculations, as stated in the following lemma:

Lemma 16. Let C be a node in a topology tree, with $\partial C = \{a,b\}$, let $h_a$, $h_b$ and
$h'_a > h_a$ be any nonnegative numbers and suppose the last call to $\mathrm{bestcutedge}_C$
was $\mathrm{bestcutedge}_C(h_a,h_b)$. Then the number of recalculations needed to compute
$\mathrm{bestcutedge}_C(h'_a,h_b)$ is O(log n).

Proof. From the definition of bestcutedge it is clear that whenever
$\mathrm{bestcutedge}_C(h'_a,h_b)$ makes two recursive calls, so does $\mathrm{bestcutedge}_C(h_a,h_b)$, and
at least one of these calls is identical to one made by $\mathrm{bestcutedge}_C(h'_a,h_b)$.
Furthermore the one new call made by $\mathrm{bestcutedge}_C(h'_a,h_b)$ differs in only one
parameter, so the same argument can be applied recursively. Thus by induction at
most one recalculation can occur for each level in the topology tree, yielding a
total of O(log n) recalculations. □

Formally, for every cluster C in the topology tree, with $\partial C = \{a,b\}$, info(C)
should include the following information in order for each of the recalculations
to take constant time:

- $\mathrm{dist}(a,b)$

- $\mathrm{maxdist}_C(a)$, $\mathrm{maxdist}_C(b)$

And if C has more than one boundary node:

- $e_1 = \mathrm{First}(a \cdots b)$, $e_2 = \mathrm{Last}(a \cdots b)$

- $\mathrm{maxdist}_{C\setminus\{e_1\}}(a)$, $\mathrm{maxdist}_{C\setminus\{e_1\}}(b)$, $\mathrm{maxdist}_{C\setminus\{e_2\}}(a)$, $\mathrm{maxdist}_{C\setminus\{e_2\}}(b)$

- $e$, $h_a$, $h_b$ and $\mathrm{hsum}_C(e,h_a,h_b)$, where $e = \mathrm{bestcutedge}_C(h_a,h_b)$ was the last
call to $\mathrm{bestcutedge}_C$.

Whenever a cluster C with two boundary nodes becomes the root of a topology
tree, either by a Merge or a DeleteRoot, it should be initialized with a call
$\mathrm{bestcutedge}_C(0,0)$.

Lemma 17. The time needed to update info during a Merge or a DeleteRoot is
O(log n).

Proof. Let A and B be the clusters we want to merge, let C denote the cluster
$A \cup B$ and let $\partial A = \{a,c\}$, $\partial B = \{b,c\}$ and $\partial C = \{a,b\}$. In order to do the Merge, we
need to compute $\mathrm{bestcutedge}_C(0,0)$. By proposition 15 this can be done by computing
$\mathrm{bestcutedge}_A(0,\ \mathrm{maxdist}_B(b) - \mathrm{dist}(b,c))$ and $\mathrm{bestcutedge}_B(\mathrm{maxdist}_A(a) - \mathrm{dist}(a,c),\ 0)$.
In the structure we already have $\mathrm{bestcutedge}_A(0,0)$ and
$\mathrm{bestcutedge}_B(0,0)$, and so by lemma 16 we only need to recalculate bestcutedge
for O(log n) clusters. Using lemma 11 and proposition 15 each recalculation can
be done in constant time given the information available, yielding a total of
O(log n) time for the update.

When deleting C again, we have to recalculate $\mathrm{bestcutedge}_A(0,0)$ and
$\mathrm{bestcutedge}_B(0,0)$. The update made by the Merge operation that created C
changed at most O(log n) clusters, and using lemma 11 and proposition 15 each
value can be recalculated in constant time given the information available. Thus
updating the structure under a DeleteRoot can be done in O(log n) time. □

By theorem 2 we now have that we can maintain info for a topology tree r in
time $O(\log^2 n)$ per operation, such that if C is an internal node in r with
$\partial C = \{a,b\}$, and the last call of $\mathrm{bestcutedge}_C$ was $\mathrm{bestcutedge}_C(h_a,h_b)$, then for any
$h'_a > h_a$ and $h'_b > h_b$ we can find an edge $e \in a \cdots b$ minimizing $\mathrm{hsum}_C(e,h'_a,h'_b)$
in time O(log n) according to lemma 16.

Given a topology tree r and an arbitrary path $P = p_1 \cdots p_2$, there may not be
a cluster in r where $p_1$ and $p_2$ are boundary nodes. Thus in order for the search
described above to work for the path P, we will have to change the topology
tree to create such a cluster. To do this we will introduce the concept of external
boundary nodes.

Let r be a topology tree, let C be an internal node of r with $\partial C = \{a,b\}$,
and let r' be the subtree of r with root C. If we restrict ourselves to looking at
r' then C has no boundary nodes in the normal sense. But the structure of r' is
still exactly as if {a, b} were boundary nodes. Formally we say that:

- {a, b} are external boundary nodes of C in r'.

- {a, b} are internal boundary nodes of C in r.

And we say that r' is a topology tree with external boundary nodes {a, b}.
Formally: For any tree T, and any nodes a, b ∈ T there obviously exists a tree T'
with topology tree r', such that T is represented by a node in r' and $\partial T = \{a,b\}$.
If we let r be the subtree of r' with root T, then r is said to be a topology tree
with external boundary nodes a and b. In the journal version we will prove the
following lemma.
Lemma 18. Given a tree T with topology tree r and two nodes $p_1$ and $p_2$ from
T. Then we can change r into a topology tree r' with external boundary nodes
$p_1$ and $p_2$, and back, using O(log n) Merge and DeleteRoot operations. □

Theorem 19. There exists an algorithm for maintaining a dynamic forest
supporting link, cut and best swap operations in $O(\log^2 n)$ time, given the ability to
use O(n log n) time and O(n) space for preprocessing, where n is the number of
nodes in the tree(s) involved in the operation.

Proof. By theorem 5 algorithm 1 solves the best swap problem. By proposition 8
we can perform step 1 in $O(\log^2 n)$ time. By proposition 9 we can compute steps 2
and 4 in O(log n) time. To solve step 3 we do the following. We cut at most two
edges to obtain the subtree U containing the path $v_1 \cdots v_2$. By lemma 18 we can
make $v_1$ and $v_2$ external boundary nodes in a topology tree structure for U using
O(log n) Merges and DeleteRoots. Then we can apply lemma 16 to find the edge
minimizing maxpathf in O(log n) time. Then we relink the topology tree back
to its normal form without external boundary nodes. This is done to rebuild the
structure so that we may use it again. This can be done using O(log n) Merges and
DeleteRoots by lemma 18. Since step 5 amounts to comparing four numbers and
picking the smallest, this step clearly runs in constant time. Thus all steps in
the algorithm can be done using O(log n) Merges and DeleteRoots. By lemma 17
both a Merge and a DeleteRoot take O(log n) time and using proposition 1 we
can update the structure in $O(\log^2 n)$ time under link and cut. By proposition 1
and lemma 17 the preprocessing takes O(n log n) time and O(n) space. □
Abstract. We study budget constrained optimal network upgrading problems.
We are given an edge weighted graph G = (V, E) where node v ∈ V
can be upgraded at a cost of c(v). This upgrade reduces the delay of each
link emanating from v. The goal is to find a minimum cost set of nodes
to be upgraded so that the resulting network has a good performance.
We consider two performance measures, namely, the weight of a minimum
spanning tree and the bottleneck weight of a minimum bottleneck
spanning tree, and present approximation algorithms.


1  Introduction,  Motivation  and  Summary  of  Results 

Several problems arising in areas such as communication networks and VLSI
design can be expressed in the following general form: Enhance the performance
of a given network by upgrading a suitable subset of nodes. In communication
networks, upgrading a node corresponds to installing faster communication
equipment at that node. Such an upgrade reduces the communication delay along
each edge emanating from the node. In signal flow networks used in VLSI design,
upgrading a node corresponds to replacing a circuit module at the node by
a functionally equivalent module containing suitable drivers. Such an upgrade
decreases the signal transmission delay along the wires connected to the module.
There is a cost associated with upgrading a node, and there is often a budget on
the total upgrading cost. Therefore, it is of interest to study the problem of
upgrading a network so that the total upgrading cost obeys the budget constraint
and the resulting network has the best possible performance among all upgrades
that satisfy the budget constraint.
The performance of the upgraded network can be quantified in a number
of  ways.  In  this  paper,  we  consider  two  such  measures,  namely,  the  weight  of  a 
minimum  spanning  tree  in  the  upgraded  network  and  the  bottleneck  cost  (i.e., 
the  maximum  weight  of  an  edge)  in  a  spanning  tree  of  the  upgraded  network. 
Under  either  measure,  the  upgrading  problem  can  be  shown  to  be  NP-hard.  So, 
the  focus  of  the  paper  is  on  the  design  of  efficient  approximation  algorithms. 


1.1  Background:  Bicriteria  Problems  and  Approximation 

The problems considered in this paper involve two optimization objectives,
namely, the upgrading cost and the performance of the upgraded network. A
framework for such bicriteria problems has been developed in [7]. A generic
bicriteria problem can be specified as a triple (A, B, F) where A and B are two
objectives and F specifies a class of subgraphs. An instance specifies a budget on
the objective A and the goal is to find a subgraph in the class F that minimizes
the objective B for the upgraded network. As an example, the problem of
upgrading a network so that the modified network has a spanning tree of weight at
most D while minimizing the node upgrading cost can be expressed as (Total
Weight, Node Upgrading Cost, Spanning Tree).

Definition 1. A polynomial time algorithm for a bicriteria problem (A, B, F) is
said to have performance (α, β), if it has the following property: For any instance
of (A, B, F) the algorithm

1. either produces a solution from the subgraph class F for which the value of
objective A is at most α times the specified budget and the value of objective
B is at most β times the minimum value of a solution from F that satisfies
the budget constraint, or

2.  correctly  provides  the  information  that  there  is  no  subgraph  from  F  which 
satisfies  the  budget  constraint  on  A. 

1.2  Problem  Definitions 

The node based upgrading model discussed in this paper can be formally described
as follows. Let G = (V, E) be a connected undirected graph. For each edge e ∈ E,
we are given three integers $d_0(e) \ge d_1(e) \ge d_2(e) \ge 0$. The value $d_i(e)$ represents
the length or delay of the edge e if exactly i of its endpoints are upgraded.

Thus, the upgrade of a node v reduces the delay of each edge incident with v.
The (integral) value c(v) specifies how expensive it is to upgrade the node v. The
cost of upgrading all vertices in W ⊆ V, denoted by c(W), is equal to $\sum_{v \in W} c(v)$.

For a set W ⊆ V of vertices, denote by $d_W$ the edge weight function resulting
from the upgrade of the vertices in W; that is, for an edge (u, v) ∈ E

$d_W(u,v) := d_i(u,v)$ where $i = |W \cap \{u,v\}|$.

We denote the total length of a minimum spanning tree (MST) in G with respect
to the weight function $d_W$ by $\mathrm{MST}(G, d_W)$.
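As a concrete reading of these definitions, the following minimal sketch evaluates the weight function $d_W$ and the MST weight under it. It uses Kruskal's algorithm with union-find, which is our own choice and not prescribed by the paper; the names `upgraded_weight`, `mst_weight`, `NODES` and `DELAYS`, as well as the triangle instance, are hypothetical illustrations.

```python
from typing import Dict, Iterable, Set, Tuple

Edge = Tuple[str, str]
Delays = Dict[Edge, Tuple[int, int, int]]  # edge -> (d0, d1, d2), d0 >= d1 >= d2 >= 0


def upgraded_weight(edge: Edge, delays: Delays, upgraded: Set[str]) -> int:
    """d_W(u, v) = d_i(u, v), where i is the number of upgraded endpoints."""
    u, v = edge
    i = int(u in upgraded) + int(v in upgraded)
    return delays[edge][i]


def mst_weight(nodes: Iterable[str], delays: Delays, upgraded: Set[str]) -> int:
    """Weight of a minimum spanning tree under d_W (graph assumed connected)."""
    parent = {v: v for v in nodes}

    def find(v: str) -> str:
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    total = 0
    for edge in sorted(delays, key=lambda e: upgraded_weight(e, delays, upgraded)):
        root_u, root_v = find(edge[0]), find(edge[1])
        if root_u != root_v:  # edge joins two components, so Kruskal keeps it
            parent[root_u] = root_v
            total += upgraded_weight(edge, delays, upgraded)
    return total


# Hypothetical toy instance (not from the paper): a triangle on a, b, c.
NODES = ["a", "b", "c"]
DELAYS: Delays = {("a", "b"): (4, 2, 1), ("b", "c"): (5, 3, 1), ("a", "c"): (6, 3, 2)}

print(mst_weight(NODES, DELAYS, set()))       # no node upgraded
print(mst_weight(NODES, DELAYS, {"b"}))       # only b upgraded
print(mst_weight(NODES, DELAYS, {"a", "b"}))  # a and b upgraded
```

On this toy instance, upgrading more nodes can only lower the MST weight, since $d_0 \ge d_1 \ge d_2$ on every edge.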
Definition 2. Given an edge and node weighted graph G = (V, E) as above
and a bound D, the upgrading minimum spanning tree problem, denoted by
(Total Weight, Node Upgrading Cost, Spanning Tree), is to upgrade
a set W ⊆ V of nodes such that $\mathrm{MST}(G, d_W) \le D$ and c(W) is minimized.

We also consider the node based upgrading problem to obtain a spanning
tree with the bottleneck cost at most a given value. We denote the bottleneck
weight (i.e., the maximum weight of an edge) of a minimum bottleneck spanning
tree of G with respect to the weight function $d_W$ by $\mathrm{MBOT}(G, d_W)$.

Definition 3. Given an edge and node weighted graph G = (V, E) as above and
a bound D, the upgrading minimum bottleneck spanning tree problem, denoted
by (Bottleneck Weight, Node Upgrading Cost, Spanning Tree), is
to upgrade a set W ⊆ V of nodes such that $\mathrm{MBOT}(G, d_W) \le D$ and c(W) is
minimized.
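To make Definition 3 concrete, the sketch below computes $\mathrm{MBOT}(G, d_W)$ and then solves a tiny instance of the bottleneck upgrading problem by exhaustive search. It relies on the standard fact that the heaviest edge of any minimum spanning tree is a minimum bottleneck value. The exhaustive search is exponential and serves only as an illustration; it is not the approximation algorithm of this paper. All names and the toy instance are hypothetical.

```python
from itertools import combinations


def bottleneck_weight(nodes, delays, upgraded):
    """MBOT(G, d_W): heaviest edge of an MST under d_W (graph assumed connected)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def weight(edge):
        # d_i with i = number of upgraded endpoints of the edge
        return delays[edge][(edge[0] in upgraded) + (edge[1] in upgraded)]

    bottleneck = 0
    for edge in sorted(delays, key=weight):
        root_u, root_v = find(edge[0]), find(edge[1])
        if root_u != root_v:
            parent[root_u] = root_v
            bottleneck = max(bottleneck, weight(edge))
    return bottleneck


def cheapest_upgrade_for_bottleneck(nodes, delays, cost, bound):
    """Exhaustive search: (cost, set) of a cheapest W with MBOT <= bound, or None."""
    best = None
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            if bottleneck_weight(nodes, delays, set(subset)) <= bound:
                price = sum(cost[v] for v in subset)
                if best is None or price < best[0]:
                    best = (price, set(subset))
    return best


# Hypothetical toy instance (not from the paper).
NODES = ["a", "b", "c"]
DELAYS = {("a", "b"): (4, 2, 1), ("b", "c"): (5, 3, 1), ("a", "c"): (6, 3, 2)}
COST = {"a": 3, "b": 2, "c": 4}

print(bottleneck_weight(NODES, DELAYS, set()))
print(cheapest_upgrade_for_bottleneck(NODES, DELAYS, COST, 3))
```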

Dual Problems. The problem (Total Weight, Node Upgrading Cost,
Spanning Tree) is formulated by specifying a budget on the weight of a tree
while the upgrading cost is to be minimized. It is also meaningful to consider
the corresponding dual problem, denoted by (Node Upgrading Cost, Total
Weight, Spanning Tree), where we are given a budget on the upgrading cost
and the goal is to minimize the weight of a spanning tree in the resulting graph.

Lemma 4. If there exists an approximation algorithm for (Total Weight,
Node Upgrading Cost, Spanning Tree) with a performance of (α, β), then
there is an approximation algorithm for (Node Upgrading Cost, Total
Weight, Spanning Tree) with performance of (β, α).

Proof. Let A be an (α, β)-approximation algorithm for (Total Weight, Node
Upgrading Cost, Spanning Tree). We will show how to use A to construct
a (β, α)-approximation algorithm for the dual problem.

An instance of (Node Upgrading Cost, Total Weight, Spanning
Tree) is specified by a graph G = (V, E), the node cost function c, the weight
functions $d_i$, i = 0, 1, 2, on the edges and the bound B on the node upgrading
cost. We denote by OPT the optimum weight of an MST after upgrading a vertex
set of cost at most B. Observe that OPT is an integer such that $(n-1)D_2 \le
\mathrm{OPT} \le (n-1)D_0$, where $D_2 := \min_{e \in E} d_2(e)$ and $D_0 := \max_{e \in E} d_0(e)$.

We use binary search to find the minimum integer D such that $(n-1)D_2 \le
D \le (n-1)D_0$ and algorithm A applied to the instance of (Total Weight,
Node Upgrading Cost, Spanning Tree) given by the weighted graph G
as above and the bound D on the weight of an MST after the upgrade outputs an
upgrading set of cost at most βB. It is easy to see that this binary search indeed
works and terminates with a value D ≤ OPT. The corresponding upgrading set
W then satisfies $\mathrm{MST}(G, d_W) \le \alpha D \le \alpha\,\mathrm{OPT}$ and $c(W) \le \beta B$. □
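The binary search in this proof can be sketched as follows. Since no concrete algorithm A is available here, the sketch swaps in an exact exhaustive oracle (`exact_oracle`, a hypothetical stand-in that is exponential in the number of nodes) and accepts a bound D whenever the oracle's upgrade set costs at most `cost_limit`, which plays the role of the scaled budget in the proof. The search assumes that acceptance is monotone in D, which holds for the exact oracle used here.

```python
from itertools import combinations


def mst_weight(nodes, delays, upgraded):
    """MST weight when each edge weighs d_i, i = number of upgraded endpoints."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def weight(edge):
        return delays[edge][(edge[0] in upgraded) + (edge[1] in upgraded)]

    total = 0
    for edge in sorted(delays, key=weight):
        root_u, root_v = find(edge[0]), find(edge[1])
        if root_u != root_v:
            parent[root_u] = root_v
            total += weight(edge)
    return total


def exact_oracle(nodes, delays, cost, bound):
    """Toy stand-in for algorithm A: a cheapest upgrade set with MST weight <= bound."""
    best = None
    for size in range(len(nodes) + 1):
        for subset in combinations(nodes, size):
            if mst_weight(nodes, delays, set(subset)) <= bound:
                price = sum(cost[v] for v in subset)
                if best is None or price < best[0]:
                    best = (price, set(subset))
    return None if best is None else best[1]


def minimize_weight_under_budget(nodes, delays, cost, cost_limit):
    """Smallest accepted D in [(n-1)*D2, (n-1)*D0] and the oracle's upgrade set."""
    n = len(nodes)
    lo = (n - 1) * min(d[2] for d in delays.values())
    hi = (n - 1) * max(d[0] for d in delays.values())
    answer = None
    while lo <= hi:
        mid = (lo + hi) // 2
        upgrade = exact_oracle(nodes, delays, cost, mid)
        if upgrade is not None and sum(cost[v] for v in upgrade) <= cost_limit:
            answer = (mid, upgrade)  # accepted: try a smaller weight bound
            hi = mid - 1
        else:
            lo = mid + 1             # rejected: the weight bound was too ambitious
    return answer


# Hypothetical toy instance (not from the paper).
NODES = ["a", "b", "c"]
DELAYS = {("a", "b"): (4, 2, 1), ("b", "c"): (5, 3, 1), ("a", "c"): (6, 3, 2)}
COST = {"a": 3, "b": 2, "c": 4}

print(minimize_weight_under_budget(NODES, DELAYS, COST, 2))
```

With an approximate oracle in place of `exact_oracle`, the returned set would only satisfy the relaxed guarantees stated in the proof.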

A result similar to Lemma 4 can be shown for the bottleneck case. In view of
these results, we express our results for the problems (Total Weight, Node
Upgrading Cost, Spanning Tree) and (Bottleneck Weight, Node
Upgrading Cost, Spanning Tree).
1.3  Summary  of  Results

For the total weight MST upgrading problem, we derive our approximation
results under the following assumption:

Assumption 5. There is a polynomial p such that $D_0 - D_2 \le p(n)$, where
$D_0 := \max_{e \in E} d_0(e)$ and $D_2 := \min_{e \in E} d_2(e)$ are the maximum and minimum
edge weight, respectively, and n denotes the number of nodes in the graph.

Theorem 6. For any fixed ε > 0, there is a polynomial time algorithm which, for
any instance of (Total Weight, Node Upgrading Cost, Spanning Tree)
satisfying Assumption 5, provides a performance of $(1, (1+\varepsilon)^2\, O(\log n))$.

For  the  bottleneck  case,  we  do  not  need  any  assumption  about  the  edge  weights. 

Theorem 7. There is an approximation algorithm for the (Bottleneck Weight,
Node Upgrading Cost, Spanning Tree) problem with performance (1, 2 ln n).

Our  approximation  results  are  complemented  by  the  following  hardness  results: 

Theorem 8. Unless $\mathrm{NP} \subseteq \mathrm{DTIME}(n^{O(\log\log n)})$, there can be no polynomial time
approximation algorithm for either (Total Weight, Node Upgrading Cost,
Spanning Tree) or (Bottleneck Weight, Node Upgrading Cost, Spanning
Tree) with a performance of (f(n), α) for any α < ln n and any polynomial
time computable function f. This result continues to hold with f(n)
being any polynomial, even if Assumption 5 holds.

Due to space limitations, the remainder of this paper discusses mainly the
algorithm mentioned in Theorem 6 above. Proofs of other results will appear in a
complete version of this paper.

1.4  Related  Work 

Some node upgrading problems have been investigated under a simpler model by
Paik and Sahni [9]. In their model, the delay of an edge is decreased by constant
factors of δ or δ² when one or two of its endpoints are upgraded, respectively.
Clearly, this model is a special case of the model treated in our paper.

Under  their  model,  Paik  and  Sahni  studied  the  upgrading  problem  for  several 
performance  measures  including  the  maximum  delay  on  an  edge  and  the  diameter 
of  the  network.  They  presented  NP-hardness  results  for  several  problems.  Their 
focus  was  on  the  development  of  polynomial  time  algorithms  for  special  classes 
of  networks  (e.g.  trees,  series-parallel  graphs)  rather  than  on  the  development  of 
approximation  algorithms.  Our  constructions  can  be  modified  to  show  that  all 
the  problems  considered  here  remain  NP-hard  even  under  the  Paik-Sahni  model. 

Edge-based network upgrading problems have also been considered in the
literature [1, 4, 5]. There, each edge has a current weight and a minimum weight
(below which the edge weight cannot be decreased). Upgrading an edge
corresponds to decreasing the weight of that particular edge and there is a cost
associated with such an upgrade. The goal is to obtain an upgraded network
with the best performance. In [4] the authors consider the problem of edge-based
upgrading to obtain the best possible MST subject to a budget constraint
on the upgrading cost and present a (1 + ε, 1 + 1/ε)-approximation algorithm.
Generalized versions where there are other constraints (e.g. bound on maximum
node degree) and the goal is to obtain a good Steiner tree are considered in [5].
Other references that address problems that can be interpreted as edge-based
improvement problems include [3, 8, 10].

2  Upgrading  Under  Total  Weight  Constraint 

In this section we develop our approximation algorithm for the (Total Weight,
Node Upgrading Cost, Spanning Tree) problem. Without loss of generality
we assume that for a given instance of (Total Weight, Node Upgrading
Cost, Spanning Tree) the bound D on the weight of the minimum spanning
tree after the upgrade satisfies $D \ge \mathrm{MST}(G, d_2)$, i.e., the weight of an MST
with respect to $d_2$, since node upgrading cannot reduce the weight of the minimum
spanning tree below this value. Thus, there always exists a subset of the
nodes which, when upgraded, leads to an MST of weight at most D. We remind
the reader that our algorithm also uses Assumption 5 (stated in Section 1.3)
regarding the edge weights in the given instance.

2.1  Overview  of  the  Algorithm 

Our  approximation  algorithm  can  be  thought  of  as  a  local  improvement  type 
algorithm.  To  begin  with,  we  compute  an  MST  in  the  given  graph  with  edge 
weights  given  by  do(e).  Now,  during  each  iteration,  we  select  a  node  and  a  subset 
of  its  neighbors  and  upgrade  them.  The  policy  used  in  the  selection  process  is 
that  of  finding  a  set  which  gives  us  the  best  ratio  improvement,  which  is  defined 
as  the  ratio  of  the  improvement  in  the  total  weight  of  the  spanning  tree  to 
the  total  cost  spent  on  upgrading  the  nodes.  Having  selected  such  a  set,  we 
recompute  the  MST  and  repeat  our  procedure.  The  procedure  is  halted  when 
the weight of the MST is at most the required threshold D. To find a subset of
nodes with the best ratio improvement in each iteration, we use an approximate
solution to the Two Cost Spanning Tree Problem defined below.
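The local improvement loop described above can be sketched on a small instance. Note the deliberate simplifications, all of which are our assumptions: the best-ratio set is found by exhaustive enumeration of a node together with every subset of its neighbors, rather than by the Two Cost Spanning Tree approximation; the improvement is measured on the whole graph; and node costs are assumed to be positive integers. The function names and the toy instance are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def mst_weight(nodes, delays, upgraded):
    """MST weight when each edge weighs d_i, i = number of upgraded endpoints."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    def weight(edge):
        return delays[edge][(edge[0] in upgraded) + (edge[1] in upgraded)]

    total = 0
    for edge in sorted(delays, key=weight):
        root_u, root_v = find(edge[0]), find(edge[1])
        if root_u != root_v:
            parent[root_u] = root_v
            total += weight(edge)
    return total


def greedy_upgrade(nodes, delays, cost, bound):
    """Repeatedly upgrade the best-ratio node set until the MST weight is <= bound."""
    upgraded = set()
    current = mst_weight(nodes, delays, upgraded)
    while current > bound:
        best = None  # (ratio, candidate set, resulting MST weight)
        for v in nodes:
            neighbours = sorted({a if b == v else b for a, b in delays if v in (a, b)})
            for size in range(len(neighbours) + 1):
                for chosen in combinations(neighbours, size):
                    candidate = ({v} | set(chosen)) - upgraded
                    if not candidate:
                        continue
                    new_weight = mst_weight(nodes, delays, upgraded | candidate)
                    price = sum(cost[u] for u in candidate)
                    ratio = Fraction(current - new_weight, price)
                    if best is None or ratio > best[0]:
                        best = (ratio, candidate, new_weight)
        if best is None or best[0] <= 0:
            return None  # no remaining upgrade improves the tree: bound unreachable
        upgraded |= best[1]
        current = best[2]
    return upgraded, current


# Hypothetical toy instance (not from the paper).
NODES = ["a", "b", "c"]
DELAYS = {("a", "b"): (4, 2, 1), ("b", "c"): (5, 3, 1), ("a", "c"): (6, 3, 2)}
COST = {"a": 3, "b": 2, "c": 4}

print(greedy_upgrade(NODES, DELAYS, COST, 4))
```

On this instance the loop first upgrades b (ratio 2) and then c (ratio 1/2), spending 6, whereas upgrading a and b would reach the same bound at cost 5. This small gap illustrates why the procedure is analyzed as an approximation rather than as an exact method.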

Definition 9 (Two Cost Spanning Tree Problem). Given a connected undirected
graph G = (V, E), two edge weight functions, c and l, and a bound B, find a
spanning tree T of G such that the total cost c(T) is at most B and the total cost
l(T) is a minimum among all spanning trees that obey the budget constraint.

The above problem can be expressed as the bicriteria problem (c-Total
Weight, l-Total Weight, Spanning Tree). This problem has been addressed
by Ravi and Goemans [11] who obtained the following result.

Theorem 10. For all ε > 0, there is a polynomial time approximation algorithm
for the Two Cost Spanning Tree problem with a performance of (1 + ε, 1).
2.2  Algorithm  and  Performance  Guarantee

The  steps  of  our  algorithm  are  shown  in  Figure  1.  This  algorithm  uses  Procedure 
Compute  QC  whose  description  appears  in  Figure  2. 


Algorithm Upgrade MST(Ω)

•  Input: A graph G = (V, E), three edge weight functions $d_0 \ge d_1 \ge d_2$, a node
weight function c, and a number D, which is a bound on the weight of an MST in
the upgraded graph; a "guess value" Ω for the optimal upgrading cost.

1.  Initialize the set of upgraded nodes: $W_0 := \emptyset$.

2.  Let $T_0 := \mathrm{MST}(G, d_{W_0})$.

3.  Initialize the iteration count: i := 1.

4.  Repeat the following steps until for the current tree $T_{i-1}$ and the weight function
$d_{W_{i-1}}$ we have $d_{W_{i-1}}(T_{i-1}) \le D$:

(a)  Let $T_{i-1} := \mathrm{MST}(G, d_{W_{i-1}})$ be an MST w.r.t. the weight function $d_{W_{i-1}}$.

(b)  Call Procedure Compute QC to find a marked claw C with "good" quotient
cost q(C). Procedure Compute QC is called with the graph G, the current
MST $T_{i-1}$, the current weight function $d_{W_{i-1}}$ and the bound Ω.

(c)  If Procedure Compute QC reports failure, then report failure and stop.

(d)  Upgrade the marked vertices M(C) in C: $W_i := W_{i-1} \cup M(C)$.

(e)  Increment the iteration count: i := i + 1.

•  Output: A spanning tree with weight at most D, such that total cost of upgrading
the nodes is no more than $(1+\varepsilon)^2\, \Omega \cdot O(\log n)$, provided Ω ≥ OPT. Here, OPT denotes
the optimal upgrading cost to reduce the weight of an MST to be at most D.


Fig.  1.  Approximation  algorithm  for  node  upgrading  under  total  weight  constraint. 


Before  we  embark  on  a  proof  of  Theorem  6,  we  give  the  overall  idea  behind 
the  proof.  Recall  that  each  basic  step  of  the  algorithm  consists  of  finding  a  node 
and  a  subset  of  neighbors  to  upgrade. 

Definition 11. A graph C = (V, E) is called a claw, if E is of the form E =
{ (v, w) : w ∈ V \ {v} } for some node v ∈ V. The node v is said to be the center
of the claw. A claw with at least two nodes is called a nontrivial claw.


Let W be a subset of the nodes upgraded so far and let T be an MST with
respect to $d_W$, that is, $T = \mathrm{MST}(G, d_W)$. For a claw C with nodes M(C) ⊆ C
marked, we define its quotient cost q(C) to be

$$q(C) := \frac{c(M(C))}{d_W(T) - \mathrm{MST}(T \cup C,\ d_{W \cup M(C)})}, \quad \text{if } M(C) \neq \emptyset,$$

and +∞ otherwise. In other words, q(C) is the cost of the vertices in M(C)
divided by the decrease in the weight of the MST when the vertices in M(C)
are also upgraded and edges in the current tree T can be exchanged for edges in
the claw C. Notice that this way the real profit of upgrading the vertices M(C)
is underestimated, since the weight of edges outside of C might also decrease.
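The quotient cost can be evaluated directly from its definition, as in the following minimal sketch. The helper names, the edge representation (tuples that are keys of the delay table) and the toy instance are hypothetical. Returning infinity when the upgrade yields no decrease is our own convention for the sketch; the definition above only fixes the value for an empty marked set.

```python
import math
from fractions import Fraction


def weight(edge, delays, upgraded):
    """d_i of the edge, where i is its number of upgraded endpoints."""
    return delays[edge][(edge[0] in upgraded) + (edge[1] in upgraded)]


def mst_weight_on(nodes, edges, delays, upgraded):
    """MST weight of the subgraph formed by `edges` (assumed connected and spanning)."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    total = 0
    for edge in sorted(edges, key=lambda e: weight(e, delays, upgraded)):
        root_u, root_v = find(edge[0]), find(edge[1])
        if root_u != root_v:
            parent[root_u] = root_v
            total += weight(edge, delays, upgraded)
    return total


def quotient_cost(nodes, delays, cost, upgraded, tree, claw_edges, marked):
    """q(C) = c(M(C)) / (d_W(T) - MST(T ∪ C, d_{W ∪ M(C)}))."""
    if not marked:
        return math.inf
    current = sum(weight(e, delays, upgraded) for e in tree)
    after = mst_weight_on(nodes, set(tree) | set(claw_edges), delays, upgraded | marked)
    gain = current - after
    if gain <= 0:
        return math.inf  # assumption of this sketch: no decrease means infinite quotient
    return Fraction(sum(cost[v] for v in marked), gain)


# Hypothetical toy instance (not from the paper); TREE is an MST for W = {}.
NODES = ["a", "b", "c"]
DELAYS = {("a", "b"): (4, 2, 1), ("b", "c"): (5, 3, 1), ("a", "c"): (6, 3, 2)}
COST = {"a": 3, "b": 2, "c": 4}
TREE = [("a", "b"), ("b", "c")]

# Claw centered at b with only its center marked.
print(quotient_cost(NODES, DELAYS, COST, set(), TREE, [("a", "b"), ("b", "c")], {"b"}))
# Claw centered at a with both a and c marked.
print(quotient_cost(NODES, DELAYS, COST, set(), TREE, [("a", "b"), ("a", "c")], {"a", "c"}))
```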

Our analysis essentially shows that in each iteration there exists a claw of
quotient cost at most $2\,\mathrm{OPT}/(d_W(T) - D)$, where $d_W(T)$ is the weight of an MST at the
beginning of the iteration and W are the nodes upgraded so far. We can then use a potential
function argument to show that this yields a logarithmic performance guarantee.


Procedure Compute QC(Ω)

•  Input: A graph G = (V, E), a spanning tree T and a weight function d on E;
W ⊆ V is the set of upgraded nodes; a "guess" Ω for the optimal upgrading cost.

1.  Let $s := \lceil \log_{1+\varepsilon} \Omega \rceil$.

2.  For each node v and all $K \in \{1, (1+\varepsilon), (1+\varepsilon)^2, \ldots, (1+\varepsilon)^s\}$ do

(a)  Set up an instance $I_{v,K}$ of the Two Cost Spanning Tree Problem as follows:

-  The vertex set of the graph $G_v$ contains all the vertices in G and an
additional "dummy node" x.

-  There is an edge (v, x) joining v to the dummy node x of length l(v, x) = 0
and cost c(v, x) = c(v), thus modeling the upgrading cost of v.

-  For each edge (v, w) ∈ E, $G_v$ contains two parallel edges h and $h_{up}$.
The edge h models the situation where w is not upgraded:

$$c(h) := 0, \qquad l(h) := \begin{cases} d_2(v,w) & \text{if } w \in W \\ d_1(v,w) & \text{if } w \notin W \end{cases}$$

Similarly, $h_{up}$ models an upgrade of w:

$$c(h_{up}) := \begin{cases} 0 & \text{if } w \in W \\ c(w) & \text{if } w \notin W \end{cases} \qquad l(h_{up}) := d_2(v,w)$$

-  For each edge (u, w) ∈ T, there is one edge (u, w) in $G_v$ which has length
l(u, w) = d(u, w) and cost c(u, w) = 0.

-  The bound B on the c-cost of the tree is set to K.

(b)  Using the algorithm mentioned in Theorem 10, find a tree of c-cost at most
(1 + ε)K and l-cost no more than that of a minimum budget K bounded
spanning tree (if one exists). Let $T_{v,K}$ be the tree produced by the algorithm.

3.  If the algorithm fails for all instances $I_{v,K}$ then report failure and stop.

4.  Among all the trees $T_{v,K}$ find a tree $T_{v^*,K^*}$ which minimizes the ratio
$c(T_{v^*,K^*})/(d(T) - l(T_{v^*,K^*}))$.

5.  Construct a marked claw C from $T_{v^*,K^*}$ as follows:

-  The center of C is $v^*$ and $v^*$ is marked.

-  The edge $(v^*, w)$ is in the claw C if $T_{v^*,K^*}$ contains an edge between $v^*$ and
w. The node w is marked if and only if the edge in $T_{v^*,K^*}$ between $v^*$ and
w has c-cost greater than zero.

•  Output: A marked claw C (with its center also marked) with quotient cost q(C)
satisfying $q(C) \le 2(1+\varepsilon)^2\, \mathrm{OPT}/(d(T) - D)$ and $c(M(C)) \le (1+\varepsilon)\Omega$.

Fig.  2.  Algorithm  for  computing  a  good  claw. 
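Steps 1 and 2 of the procedure enumerate a geometric grid of budgets K. The following sketch builds that grid and checks the property that the grid is meant to provide: every integral target cost between 1 and Ω is covered by some K that overshoots it by a factor of at most (1 + ε). Exact rational arithmetic is used to sidestep floating-point logarithms; the function names and the sample values Ω = 10, ε = 1/2 are hypothetical.

```python
from fractions import Fraction


def budget_grid(omega, eps):
    """Powers (1+eps)^0, ..., (1+eps)^s with s = ceil(log_{1+eps} omega), omega >= 1."""
    grid = [Fraction(1)]
    while grid[-1] < omega:  # stop at the first power that reaches omega
        grid.append(grid[-1] * (1 + eps))
    return grid


def covering_budget(grid, target):
    """Smallest grid value K with K >= target, or None if the grid is too short."""
    for budget in grid:
        if budget >= target:
            return budget
    return None


EPS = Fraction(1, 2)
OMEGA = 10
GRID = budget_grid(OMEGA, EPS)

print(len(GRID) - 1)  # the exponent s
for target in range(1, OMEGA + 1):
    budget = covering_budget(GRID, target)
    # Each target c has a budget K with c <= K <= (1 + eps) * c.
    print(target, float(budget))
```

The grid has only s + 1 entries per node, which keeps the number of Two Cost Spanning Tree instances polynomial as long as Ω is polynomially bounded in the input size.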
2.3  Bounded  Claw  Decompositions

Definition 12. Let G = (V, E) be a graph and W ⊆ V a subset of marked
vertices. Let κ ≥ 1 be an integer constant. A κ-bounded claw decomposition of
G with respect to W is a collection $C_1, \ldots, C_r$ of nontrivial claws, which are all
subgraphs of G, with the following properties:

1.  $\bigcup_{i=1}^{r} V(C_i) = V$ and $\bigcup_{i=1}^{r} E(C_i) = E$.

2.  No node from W appears in more than κ claws.

3.  The claws are edge-disjoint.

4.  If a claw $C_i$ contains nodes from W, then its center belongs also to W.

Lemma 13. Let F be a forest in G = (V, E) and let W ⊆ V be a set of marked
nodes. Then there is a 2-bounded claw decomposition of F with respect to W. □

Lemma 14. Let $T := T_{i-1}$ be an MST at the beginning of iteration i with
$W := W_{i-1}$ being the nodes upgraded so far. Let U ⊆ V be a set of nodes. Let
$T' := \mathrm{MST}(G, d_{W \cup U})$ be a minimum spanning tree after the additional upgrade of the
vertices in U. Then, there is a bijection $\varphi : T \to T'$ with the following properties:
1. For all edges $e \in T \cap T'$ we have $\varphi(e) = e$, 2. $d_{W \cup U}(\varphi(e)) \le d_W(e)$ for all
$e \in T$, 3. the "swaps" $e \to \varphi(e)$ transform T into T', and
4. $\sum_{e \in T} \bigl(d_W(e) - d_{W \cup U}(\varphi(e))\bigr) = d_W(T) - d_{W \cup U}(T')$. □

Lemma 15. Let $T := T_{i-1}$ be an MST at the beginning of iteration i, i.e.,
$T = \mathrm{MST}(G, d_W)$, where $W := W_{i-1}$ is the upgrading set constructed so far.
Then there is a marked claw C (where its center v is also marked and $v \notin W$)
with quotient cost q(C) satisfying

$$q(C) \le \frac{2\,\mathrm{OPT}}{d_W(T) - D}.$$

Proof. Let $T' := \mathrm{MST}(G, d_{W \cup \mathrm{OPT}})$ be an MST after the additional upgrade of
the vertices in OPT. Clearly, $d_{W \cup \mathrm{OPT}}(T') \le D$. Apply Lemma 13 to T' with the
vertices in $Z := \mathrm{OPT} \setminus W$ marked. The lemma shows that there is a 2-bounded
claw decomposition of T' with respect to Z. Let the claws be $C_1, \ldots, C_r$. In each
claw $C_j$, the corresponding nodes $M(C_j) := C_j \cap Z$ from Z are marked. Since
the decomposition is 2-bounded with respect to Z, it follows that

$$\sum_{j=1}^{r} c(M(C_j)) \le 2 \cdot \mathrm{OPT}. \qquad (1)$$

Moreover, the cost $c(M(C_j))$ of the marked nodes in each single claw $C_j$ does
not exceed OPT, since we have marked only nodes from Z. By Lemma 14, there
exists a bijection $\varphi : T \to T'$ such that

$$\sum_{e \in T} \bigl(d_W(e) - d_{W \cup \mathrm{OPT}}(\varphi(e))\bigr) = d_W(T) - d_{W \cup \mathrm{OPT}}(T') \ge d_W(T) - D. \qquad (2)$$
For each of the claws $C_j$ with $M(C_j) \ne \emptyset$ in the 2-bounded decomposition of $T'$,
its quotient cost $q(C_j)$ satisfies

\[ q(C_j) \le \frac{c(M(C_j))}{\sum_{\varphi(e) \in C_j} \bigl( d_W(e) - d_{W \cup \mathrm{OPT}}(\varphi(e)) \bigr)}, \tag{3} \]

since we can exchange the edges $\varphi(e)$ (with $\varphi(e) \in C_j$) for the corresponding edges $e$ in
the current tree $T$ after the upgrade and thus decrease the weight of the tree by
at least $\sum_{\varphi(e) \in C_j} \bigl( d_W(e) - d_{W \cup \mathrm{OPT}}(\varphi(e)) \bigr)$.

Let $C$ be a claw among all the claws $C_j$ with minimum $q(C)$. Then,

\[ q(C) \cdot \sum_{\varphi(e) \in C_j} \bigl( d_W(e) - d_{W \cup \mathrm{OPT}}(\varphi(e)) \bigr) \le c(M(C_j)) \quad \text{for } j = 1, \ldots, r. \tag{4} \]

Notice that the above inequality holds regardless of whether $M(C_j)$ is empty or
not. Summing the inequalities in (4) over $j = 1, \ldots, r$ and using (1)
and (2), it can be seen that $C$ is a claw with the desired properties. □
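The summation step can be spelled out as follows. This is a sketch under the assumption that the claws $C_1, \ldots, C_r$ partition the edges of $T'$, so that the preimages under $\varphi$ partition the edges of $T$, and that $q(C) \ge 0$ and $d_W(T) > D$:

```latex
\begin{align*}
q(C)\,\bigl(d_W(T) - D\bigr)
  &\le q(C) \sum_{e \in T} \bigl( d_W(e) - d_{W \cup \mathrm{OPT}}(\varphi(e)) \bigr)
     && \text{by (2)} \\
  &= \sum_{j=1}^{r} q(C) \sum_{\varphi(e) \in C_j}
       \bigl( d_W(e) - d_{W \cup \mathrm{OPT}}(\varphi(e)) \bigr)
     && \text{claws partition } T' \\
  &\le \sum_{j=1}^{r} c(M(C_j))
     && \text{by (4)} \\
  &\le 2 \cdot \mathrm{OPT}
     && \text{by (1)}.
\end{align*}
```

Dividing by $d_W(T) - D$ gives the bound on $q(C)$ claimed in Lemma 15.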


2.4  Finding  a  good  claw  in  each  iteration 

Lemma  15  implies  the  existence  of  a  marked  claw  with  the  required  properties. 
We  will  now  deal  with  the  problem  of  finding  such  a  claw. 

Lemma 16. Suppose that the bound $\Omega$ given to Algorithm UPGRADE MST satisfies
$\Omega \ge \mathrm{OPT}$. Then, for each stage $i$ of the algorithm, it chooses a marked
claw $C'$ such that

\[ q(C') \le 2(1+\varepsilon)^2 \, \frac{\mathrm{OPT}}{d_W(T) - D} \quad \text{and} \quad c(M(C')) \le (1+\varepsilon)\,\Omega, \]

where $T := T_{i-1}$ is an MST at the beginning of iteration $i$ and $W := W_{i-1}$ is
the set of nodes upgraded so far.


Proof. By Lemma 15, there is a marked claw $C$ with quotient cost $q(C)$ at most
$2 \cdot \mathrm{OPT} / (d_W(T) - D)$. Let $v$ be the center of this claw. By Lemma 15, $v$ is marked. Let
$c(C) := c(M(C))$ be the cost of the marked nodes in $C$ and let
$L := \mathrm{MST}(T \cup C, d_{W \cup M(C)})$ be the weight of the MST in $T \cup C$ resulting from the upgrade of
the marked vertices in $C$. Then, by definition of the quotient cost $q(C)$ we have

\[ q(C) = \frac{c(C)}{d_W(T) - L} \le \frac{2 \cdot \mathrm{OPT}}{d_W(T) - D}. \tag{5} \]


Consider the iteration of Procedure Compute QC when it processes the
instance of Two Cost Spanning Tree Problem with graph $G_v$ and
$c(C) \le K \le (1+\varepsilon) \cdot c(C)$. The tree $\mathrm{MST}(T \cup C, d_{W \cup M(C)})$ induces a spanning tree in
$G_v$ of total $c$-cost at most $c(C)$ (which is at most $K$) and of total $l$-length no
more than $L$. Thus, the algorithm from Theorem 10 will find a tree $T_{v,K}$ such
that its total $c$-cost $c(T_{v,K})$ is bounded from above by $(1+\varepsilon)K \le (1+\varepsilon)^2 c(C)$
and its total $l$-length $l(T_{v,K})$ is no more than $L$.

By construction, the marked claw $C'$ computed by Procedure Compute
QC from $T_{v,K}$ has quotient cost at most $c(T_{v,K}) / (d_W(T) - l(T_{v,K}))$, which is at
most $(1+\varepsilon)^2 c(C) / (d_W(T) - L)$. The lemma now follows from (5). □
2.5 Guessing an Upper Bound on the Improvement Cost

We run our Algorithm UPGRADE MST depicted in Figure 1 for all values of

\[ \Omega \in \{1, (1+\varepsilon), (1+\varepsilon)^2, \ldots, (1+\varepsilon)^t\}, \quad \text{where } t := \lceil \log_{1+\varepsilon} c(V) \rceil. \]

We then choose the best solution among all solutions produced. Our analysis
shows that when $\mathrm{OPT} \le \Omega \le (1+\varepsilon) \cdot \mathrm{OPT}$, the algorithm will indeed produce a
solution. In the sequel, we estimate the quality of this solution. Assume that the
algorithm uses $f+1$ iterations and denote by $C_1, \ldots, C_f, C_{f+1}$ the claws chosen
in Step 4b of the algorithm. Let $c_i := c(M(C_i))$ denote the cost of the vertices
upgraded in iteration $i$. Then, by construction

\[ c_i \le (1+\varepsilon)\,\Omega \le (1+\varepsilon)^2 \cdot \mathrm{OPT} \quad \text{for } i = 1, \ldots, f+1. \tag{6} \]
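The guessing scheme can be illustrated with a minimal sketch. It assumes that the optimum cost is at least 1 and at most some known upper bound; the names `candidate_bounds`, `first_good_guess`, and `max_cost` are hypothetical and not taken from the algorithm in Figure 1.

```python
def candidate_bounds(eps, max_cost):
    """Guesses 1, (1+eps), (1+eps)^2, ... up to the first value >= max_cost.

    max_cost is an assumed upper bound on the optimum upgrade cost.
    """
    if eps <= 0 or max_cost < 1:
        raise ValueError("need eps > 0 and max_cost >= 1")
    bounds = [1.0]
    while bounds[-1] < max_cost:
        bounds.append(bounds[-1] * (1 + eps))
    return bounds


def first_good_guess(bounds, opt):
    """Smallest guess that is at least opt (opt is unknown to the algorithm;
    it is passed here only to illustrate that a good guess exists)."""
    return next(b for b in bounds if b >= opt)
```

For every optimum value between 1 and `max_cost`, the smallest guess that is at least the optimum exceeds it by a factor of at most $1+\varepsilon$, which is the situation the analysis relies on.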


2.6  Potential  Function  Argument 

We are now ready to complete the proof of the performance stated in Theorem 6.
Let $\mathrm{MST}_i$ denote the weight of the MST at the end of iteration $i$, i.e.,
$\mathrm{MST}_i := d_{W_i}(T_i)$. Define $\phi_i := \mathrm{MST}_i - D$. Since we have assumed that the algorithm uses
$f+1$ iterations, we have $\phi_i \ge 1$ for $i = 0, \ldots, f$ and $\phi_{f+1} \le 0$. As before, let
$c_i := c(M(C_i))$ denote the cost of the vertices upgraded in iteration $i$. Then

\[ \phi_{i+1} = \phi_i - (\mathrm{MST}_i - \mathrm{MST}_{i+1}) \overset{\text{Lemma 16}}{\le} \Bigl( 1 - \frac{c_{i+1}}{\alpha \cdot \mathrm{OPT}} \Bigr) \phi_i, \tag{7} \]


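where $\alpha := 2(1+\varepsilon)^2$. We now use an analysis technique due to Leighton and
Rao [6]. The recurrence (7) and the estimate $\ln(1 - \tau) \le -\tau$ give us

\[ \sum_{i=1}^{f} c_i \le \alpha \cdot \mathrm{OPT} \cdot \ln \frac{\phi_0}{\phi_f}. \tag{8} \]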


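Notice that the total cost of the nodes chosen by the algorithm is exactly the
sum $\sum_{i=1}^{f+1} c_i$. By (8) and (6) we have

\[ \sum_{i=1}^{f+1} c_i \le (1+\varepsilon)^2\,\mathrm{OPT} + 2(1+\varepsilon)^2\,\mathrm{OPT} \cdot \ln \frac{\phi_0}{\phi_f}. \tag{9} \]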

We will now show how to bound $\ln(\phi_0/\phi_f)$. Notice that $\phi_f = \mathrm{MST}_f - D \ge 1$, since
the algorithm uses $f+1$ iterations and does not stop after the $f$th iteration. We
have $\phi_0 = \mathrm{MST}_0 - D \le (n-1)(D_0 - D_2)$, where $D_0$ and $D_2$ denote the maximum
and the minimum edge weight in the graph. It now follows from Assumption 5
that $\ln \phi_0 \in O(\log(n\,p(n))) \subseteq O(\log n)$. Using this result in (9) yields

\[ \sum_{i=1}^{f+1} c_i \le (1+\varepsilon)^2 \cdot \mathrm{OPT} + 2(1+\varepsilon)^2\,O(\log n) \cdot \mathrm{OPT} \in (1+\varepsilon)^2\,O(\log n) \cdot \mathrm{OPT}. \quad \Box \]
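The potential function argument can be checked numerically with a small sketch. It assumes the recurrence has the form $\phi_{i+1} \le (1 - c_{i+1}/(\alpha \cdot \mathrm{OPT}))\,\phi_i$ and simulates the equality case; the function name and the parameter `alpha_opt` (standing for $\alpha \cdot \mathrm{OPT}$) are hypothetical.

```python
import math


def potential_bound(phi0, costs, alpha_opt):
    """Return (sum of costs, alpha_opt * ln(phi0 / phi_f)).

    The potential is decreased by exactly the factor (1 - c / alpha_opt)
    in each iteration, which is the slowest decrease the recurrence allows.
    """
    phi = phi0
    for c in costs:
        if not 0 <= c < alpha_opt:
            raise ValueError("each cost must lie in [0, alpha_opt)")
        phi *= 1 - c / alpha_opt
    return sum(costs), alpha_opt * math.log(phi0 / phi)
```

Because $\ln(1 - \tau) \le -\tau$, the first returned value never exceeds the second. A faster decrease of the potential only makes the logarithm larger, so the equality case is the critical one.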
3 Concluding Remarks

Our  algorithms  produced  solutions  in  which  the  budget  constraints  were  strictly 
satisfied.  This  is  unlike  many  bicriteria  network  design  problems  where  it  is 
necessary  to  violate  the  budget  constraint  to  obtain  a  solution  that  is  near- 
optimal  with  respect  to  the  objective  function  [7]. 

An  open  problem  that  arises  immediately  from  our  work  is  whether  there  is 
a good approximation algorithm for the (Total Weight, Node Upgrading
Cost,  Spanning  Tree)  problem  even  when  Assumption  5  is  not  satisfied.  It 
is  also  of  interest  to  investigate  whether  our  results  for  spanning  trees  can  be 
extended  to  Steiner  trees.  Other  open  problems  under  the  node-based  upgrading 
model  can  be  formulated  using  different  performance  measures  for  the  upgraded 
network.  Some  measures  which  are  of  interest  in  this  context  include  bottleneck 
weight,  diameter  and  lengths  of  paths  between  specified  pairs  of  vertices. 
Abstract. The formalism of monadic second-order (MS) logic has been
very successful in unifying a large number of algorithms for graphs of
bounded treewidth. We extend the elegant framework of MS logic from
static problems to dynamic problems, in which queries about MS properties
of a graph of bounded treewidth are interspersed with updates of vertex
and edge labels. This allows us to unify and occasionally strengthen
a number of scattered previous results obtained in an ad-hoc manner and
to enable solutions to a wide range of additional problems to be derived
automatically.

As an auxiliary result of independent interest, we dynamize a data structure
of Chazelle and Alon and Schieber for answering queries about sums
of labels along paths in a tree with edges labeled by elements of a semigroup.


1  Introduction 

Many graph properties can be expressed via formulas in a suitable logic. E.g.,
for given vertices $s$ and $t$ in a directed graph, the fact that the subgraph spanned
by a set $A$ of edges contains a path from $s$ to $t$ can be expressed by saying that
every vertex set $U$ containing $s$, but not $t$, can be left via an edge in $A$, i.e., by
the formula

\[ \mathit{Joins}(A, s, t) \equiv \forall U \bigl( (s \in U \wedge t \notin U) \Rightarrow \exists e\, \exists u\, \exists v\, (\mathit{tail}(u, e) \wedge \mathit{head}(v, e) \wedge e \in A \wedge u \in U \wedge v \notin U) \bigr), \]

where $e$ ranges over all edges, $u$ and $v$ range over all vertices, $U$ ranges over all
sets of vertices, and $\mathit{tail}(u, e)$ and $\mathit{head}(v, e)$ express that $u$ and $v$ are the tail
and the head of $e$, respectively. If we want the graph spanned by $A$ to be just a
single (simple) path from $s$ to $t$, we can additionally require $A$ to be minimal,
i.e., $\mathit{Path}(A, s, t) \equiv \mathit{Joins}(A, s, t) \wedge \forall B ((B \subseteq A \wedge \mathit{Joins}(B, s, t)) \Rightarrow B = A)$,
where $B$ ranges over all sets of edges.
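The two formulas can be evaluated by brute force on a small directed graph. The following sketch is exponential in the graph size and is meant only to make the quantifier structure concrete. The representation (a dictionary mapping edge names to `(tail, head)` pairs) and all function names are assumptions made for illustration.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of items as a set."""
    items = list(items)
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            yield set(combo)


def joins(A, s, t, vertices, edges):
    """Joins(A, s, t): every U containing s but not t is left by an edge of A."""
    for U in subsets(vertices):
        if s in U and t not in U:
            if not any(edges[e][0] in U and edges[e][1] not in U for e in A):
                return False
    return True


def path(A, s, t, vertices, edges):
    """Path(A, s, t): A joins s to t and no proper subset of A does."""
    if not joins(A, s, t, vertices, edges):
        return False
    return all(B == set(A) or not joins(B, s, t, vertices, edges)
               for B in subsets(A))


def reachable(A, s, t, edges):
    """Plain graph search using only the edges in A, for comparison."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for e in A:
            tail, head = edges[e]
            if tail == u and head not in seen:
                seen.add(head)
                stack.append(head)
    return t in seen
```

On any small example, `joins` agrees with `reachable`, and `path` holds exactly for the edge sets of simple paths from `s` to `t`.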

Expressing computational problems such as “Is there a path from $s$ to $t$?”
in a formal framework holds out the prospect of deriving algorithms to solve
such problems in an automatic way. Indeed, every graph property expressible in
first-order logic can be decided in polynomial time. The catch is that first-order
logic is too weak to express most graph properties of interest (see, e.g., (Courcelle,
1990a)). It allows variables ranging over vertices and edges, existential and
universal quantification over such variables, the usual logic connectives $\wedge$, $\vee$,
and $\neg$, and predicates such as tail and head for accessing the basic connectivity
structure of the graph under consideration. Very frequently, however, one is
led, as in the examples above, to introduce variables ranging not over individual
vertices or edges, but over sets of vertices or edges. Extending first-order logic
with this possibility, we arrive at monadic second-order (MS) logic. As noted by
many researchers, MS logic is a powerful language that allows the expression
of a wide range of graph properties. Indeed, the collection of decision problems
on graphs that can be defined by MS formulas is so large that it includes many
NP-complete problems, leaving little hope of obtaining efficient algorithms for
the general case. Rather than reverting to a less expressive logic, one can try to
evade this problem by restricting the class of input graphs. Arnborg et al. (1991)
argue that a particularly felicitous combination is to consider problems definable
by an MS formula on graphs of bounded treewidth, i.e., on graphs drawn from
a class with a uniform upper bound on the treewidth of all graphs in the class.
Loosely speaking, the treewidth of a graph is a measure of how far the graph
deviates from being a tree. The details of the definition will be provided in the
next section.

Consider a single MS formula $\Phi$ with $l$ free set variables (such as “$A$” in
the formula “$\mathit{Path}(A, s, t)$”) and without free simple variables. $\Phi$ gives rise to
several computational graph problems: First, there is the decision problem of
determining whether there are sets $A_1, \ldots, A_l$ of vertices or edges that satisfy $\Phi$
if substituted for its free variables (e.g., “Is there a path from $s$ to $t$?”). For this
first type of problem it is not necessary to allow $\Phi$ to have free variables; we
might as well quantify them existentially. Still, we keep the present formulation
for the sake of uniformity. Second, the counting problem of determining the number
of such tuples (e.g., “How many (simple) paths are there from $s$ to $t$?”).
Third, if the input additionally associates each vertex or edge $a$ with an $l$-tuple
$(f_1(a), \ldots, f_l(a))$ of real numbers, whose $i$th element is interpreted as the cost
of including $a$ in $A_i$, for $i = 1, \ldots, l$, the optimization problem of computing the
minimal cost of a tuple $(A_1, \ldots, A_l)$ that satisfies $\Phi$ (e.g., “What is the distance
from $s$ to $t$?”). Fourth, in the same setting, the construction problem (this is
not a standard term) of actually computing a tuple $(A_1, \ldots, A_l)$ satisfying $\Phi$
and of minimal cost (e.g., “Which path from $s$ to $t$ is shortest?”). And fifth, if
$f_i(a)$ is reinterpreted as the probability of $a$ stepping into $A_i$, for $i = 1, \ldots, l$,
with each vertex or edge entering each set independently of all other such random
decisions, the reliability problem of computing the probability of obtaining
a tuple $(A_1, \ldots, A_l)$ that satisfies $\Phi$ (e.g., “What is the probability of having an
operational path from $s$ to $t$?”).

Results by Courcelle (1990b) and Bodlaender (1996a) imply that every decision
problem defined by an MS property can be solved in linear time on graphs
of bounded treewidth. Generalizations of these and related earlier results to
counting, optimization, construction, and reliability problems were investigated
by a number of authors (Arnborg et al., 1991; Bern et al., 1987; Bodlaender,
1993a; Borie et al., 1992; Courcelle and Mosbah, 1993; Stearns and Hunt, 1996).
One of the simplest and most general extensions was suggested by Courcelle
and Mosbah (1993), and we will essentially use their framework. In our formulation,
a generic algorithm is instantiated by choosing a particular commutative
semiring $\mathcal{R} = (R, \oplus, \otimes, 0, 1)$, i.e., an algebraic structure consisting of a set
$R$, equipped with two associative and commutative operations $\oplus$ and $\otimes$ with
neutral elements 0 and 1, respectively, such that $\otimes$ distributes over $\oplus$ (i.e.,
$a \otimes (b \oplus c) = (a \otimes b) \oplus (a \otimes c)$ for all $a, b, c \in R$) and $a \otimes 0 = 0$ for all $a \in R$. Given
an input graph $G$ with associated functions $f_1, \ldots, f_l$ (which will be called cost
functions, independently of their interpretation), the generic algorithm computes
the value of $G$ under $\Phi$ and $\mathcal{R}$, defined as the quantity

\[ |G|_{\Phi,\mathcal{R}} = \bigoplus_{G \models \Phi[A_1, \ldots, A_l]} \; \bigotimes_{i=1}^{l} \; \bigotimes_{a \in A_i} f_i(a), \]

i.e., the “sum”, over all tuples $(A_1, \ldots, A_l)$ that satisfy $\Phi$, of the “products”,
over the sets $A_i$, of the appropriate costs. With suitably chosen commutative
semirings, this can be shown to solve the problems mentioned above as well as a
number of additional problems. For example, with $\mathcal{R} = (\{0, 1, 2, \ldots\}, +, \cdot, 0, 1)$,
we obtain a solution to the counting problem.
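The definition of the value of a graph can be evaluated directly on a tiny instance. The sketch below does so by exhaustive enumeration for the path property with a single set variable ($l = 1$); it illustrates the definition, not the generic algorithm. The graph encoding and all names are assumptions, and the counting instance additionally assumes that every cost equals 1.

```python
from itertools import combinations


def reaches(A, edges, s, t):
    """True if t can be reached from s using only the edges in A."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for e in A:
            tail, head = edges[e]
            if tail == u and head not in seen:
                seen.add(head)
                stack.append(head)
    return t in seen


def is_path(A, edges, s, t):
    """A joins s to t and is minimal with this property."""
    return reaches(A, edges, s, t) and not any(
        reaches(A - {e}, edges, s, t) for e in A)


def value(edges, cost, s, t, add, mul, zero, one):
    """'Sum' over all edge sets A that form a path of the 'product' of costs."""
    total = zero
    names = list(edges)
    for k in range(len(names) + 1):
        for combo in combinations(names, k):
            A = frozenset(combo)
            if is_path(A, edges, s, t):
                product = one
                for e in A:
                    product = mul(product, cost[e])
                total = add(total, product)
    return total
```

With `add` and `mul` taken as ordinary addition and multiplication and unit costs, `value` counts the simple paths from `s` to `t`. With `min` and addition (and neutral elements infinity and 0) it returns the distance.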

The problem of computing $|G|_{\Phi,\mathcal{R}}$ for fixed $\Phi$ and $\mathcal{R}$ will be called static,
meaning that the entire input as well as the question to be answered are known
from the outset. The focus of this paper is to extend the elegant framework
of MS logic to a dynamic setting in which, following a certain initialization or
preprocessing based on the input graph, a sequence of attribute updates and
queries must be executed online, i.e., each query must be answered before the
next operation to be executed is revealed. An (attribute) update changes a single
attribute of a vertex or edge without affecting the structure of the graph. One
might also consider structural updates that insert or delete vertices or edges.
The data structures and algorithms described here can easily be extended to
allow deletions, but supporting insertions of vertices and edges appears to be
considerably more difficult; see (Bodlaender, 1993b) for results in this direction
in the case of graphs of treewidth 2. We allow boolean attributes, which take
values in {false, true}, indicate (non)membership in “user-defined” sets, and
may be tested in $\Phi$ through corresponding predicates, and ring attributes, which
take values in $R$, together define the cost functions, and cannot be referred to
in $\Phi$. A query temporarily (for the duration of the query) carries out a constant
number of updates of boolean and/or ring attributes, thereby changing $G$ into
$G'$, and then computes and returns $|G'|_{\Phi,\mathcal{R}}$, after which all attributes revert to
their values before the query. This view of a query operation may be unfamiliar,
but it is general and permits a convenient statement of our results.

Our running example centered around the MS formula $\mathit{Path}(A, s, t)$ will
be used to clarify some of the concepts introduced above. We have already seen
that $\mathit{Path}(A, s, t)$ expresses that the edges in $A$ span a (simple) path from $s$ to
$t$, and if we give each edge $e$ a ring attribute $f(e)$ equal to its length, the length
of the path spanned by $A$ is minimized by choosing
$\mathcal{R} = (\mathbb{R} \cup \{\infty\}, \min, +, \infty, 0)$. What is lacking is that we would like to support
queries asking for the distance from $s$ to $t$ (call this an $(s, t)$-query), where $s$
and $t$ are variable. We can achieve this effect within the general framework by
introducing two “user-defined” sets, $S$ and $T$, both initialized to $\emptyset$, letting an
$(s, t)$-query temporarily change two boolean vertex attributes to make $S = \{s\}$
and $T = \{t\}$, and using instead of the original formula the formula

\[ \exists s\, \exists t\, (\mathit{Origin}(s) \wedge \mathit{Destination}(t) \wedge \mathit{Path}(A, s, t)), \]

where Origin and Destination are predicate symbols corresponding to the sets
$S$ and $T$. It should be clear that other traditional types of queries can be formulated
in a similar way. We show that for all $r \ge 1$, the dynamic version of
every problem defined by an MS formula $\Phi$ and a commutative semiring $\mathcal{R}$ whose
operations can be carried out in constant time (call such a semiring efficient)
can be solved on $n$-vertex graphs of bounded treewidth with initialization time
$O(n)$, (attribute-)update time […] and query time $O(r + \alpha(n))$, where $\alpha$
is a slowly-growing “inverse Ackermann” function. Alternatively, for arbitrary
integer $k \ge 1$, with the same update time, but initialization time $O(n I_k(n))$
and query time $O(r + k)$, where $I_k$, for every integer $k \ge 1$, is another slowly-growing
function. Both $\alpha$ and the functions $I_k$ are defined in the next section. In
the  special  case  of  the  dynamic  distance  and  shortest-path  problems  considered 
above,  this  result  was  obtained  previously  by  Chaudhuri  and  Zaroliagis  (1995) 
for  r  =  0(1)  as  well  as  with  a  worse  tradeoff  between  initialization  time,  update 
time,  and  query  time.  In  more  detail,  Chaudhuri  and  Zaroliagis  indicate  the  fol¬ 
lowing  bounds,  for  all  integers  r  >  1:  Initialization  time  0(c’'n),  update  time 
0(c^^n^^  ’^),  and  query  time  0(c^”a(n)),  where  c  =  0(3”).  In  order  to  compare 
these  bounds  with  ours,  observe,  e.g.,  that  in  order  to  achieve  an  update  time 
of  0(2  V^^°^)  with  the  bounds  of  Chaudhuri  and  Zaroliagis,  it  is  necessary  to 
choose  r  larger  than  |loglogn,  which  yields  a  query  time  of  (log77)^(^°§^°5«)^ 

whereas our bounds associate an update time of $O(2^{\sqrt{\log n}})$ with a query time
of $O(\sqrt{\log n})$. Our bounds are never worse than those of Chaudhuri and Zaroliagis,
and strictly better for all nonconstant r and r. One end of the tradeoff,
with update and query times both $O(\log n)$, was demonstrated previously by
Bodlaender (1993b).

If only queries but no updates are to be supported, we achieve initialization
time $O(n)$ and query time $O(\alpha(n))$ or, for every integer $k \ge 1$, initialization time
$O(n I_k(n))$ and query time $O(k)$. This result was found previously by Chaudhuri
and Zaroliagis (1995) for the distance and shortest-path problems and by Arikati
et al. (1995) for the problem of computing (the value of) a minimum cut separating
two given vertices. (The value of a minimum cut separating $s$ and $t$ can
be found by minimizing $\sum_{e \in A} f(e)$, where $f(e)$ denotes the capacity of the edge
$e$, subject to $\forall B\, (\mathit{Path}(B, s, t) \Rightarrow (A \cap B \ne \emptyset))$, where $A$ and $B$ range over all
sets of edges.)

In some cases, queries may become cheaper if they can be batched. We consider
queries that (temporarily) change at most $d$ boolean attributes and no
ring attributes and use the term exhaustive $d$-dimensional query to denote a set
of all possible queries of this type (e.g., the well-known all-pairs shortest-paths
problem is to answer an exhaustive 2-dimensional query). We can show that for
all $d \ge 1$, exhaustive $d$-dimensional queries defined by an MS formula and an
efficient commutative semiring can be answered in $O(n^d)$ time for $n$-vertex input
graphs of bounded treewidth. This was proved previously for $d = 1$ and $d = 2$
for the distance problem by Radhakrishnan et al. (1992) and for $d = 2$ for the
problem of computing (the value of) a minimum cut by Arikati et al. (1995).

All of the algorithms described above translate into parallel algorithms for
the EREW PRAM. Due to space limitations, we omit further discussion of exhaustive
queries and parallel algorithms.


2  Definitions 

As introduced by Robertson and Seymour (1986), a tree decomposition of a graph
$G = (V, E)$ is a pair $(T_D, \mathcal{U})$, where $T_D = (V_D, E_D)$ is a tree and
$\mathcal{U} = \{U_x \mid x \in V_D\}$ is a family of subsets of $V$ called bags, one for each node in $T_D$, such that

(1) $\bigcup_{x \in V_D} U_x = V$ (every vertex in $G$ occurs in some bag);

(2) for all $u, v \in V$, if $u$ and $v$ are the endpoints of some edge in $E$, then there
exists an $x \in V_D$ with $\{u, v\} \subseteq U_x$ (every edge in $G$ is “internal” to some
bag);

(3) for all $x, y, z \in V_D$, if $y$ is on the (simple) path from $x$ to $z$ in $T_D$, then
$U_x \cap U_z \subseteq U_y$ (every vertex in $G$ occurs in the bags in a connected part of
$T_D$, i.e., in a subtree).

The width of a tree decomposition $(T_D = (V_D, E_D), \{U_x \mid x \in V_D\})$ is
$\max_{x \in V_D} |U_x| - 1$. The treewidth of a graph $G$ is the smallest width of any tree
decomposition of $G$. Many important graph classes are of bounded treewidth,
including those of outerplanar and series-parallel graphs; for surveys of results
of this kind, see (van Leeuwen, 1990) and (Bodlaender, 1996b).
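The three conditions can be checked mechanically. The sketch below assumes that the decomposition tree is given as an adjacency dictionary `tree_adj` that really is a tree (this is not verified) and that `bags` maps each tree node to a set of graph vertices; both names are illustrative. Condition (3) is tested in its equivalent form: the tree nodes whose bags contain a given vertex must be connected.

```python
def is_tree_decomposition(vertices, edges, tree_adj, bags):
    """Check conditions (1)-(3) for a candidate tree decomposition."""
    # (1) every vertex occurs in some bag, and bags contain only vertices
    if set().union(*bags.values()) != set(vertices):
        return False
    # (2) every edge has both endpoints in a common bag
    for u, v in edges:
        if not any(u in bag and v in bag for bag in bags.values()):
            return False
    # (3) the tree nodes containing a vertex form a connected subtree
    for v in vertices:
        nodes = {x for x, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            x = stack.pop()
            for y in tree_adj[x]:
                if y in nodes and y not in seen:
                    seen.add(y)
                    stack.append(y)
        if seen != nodes:
            return False
    return True


def width(bags):
    """Width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1
```

For example, a cycle on four vertices has a decomposition with the two bags {1, 2, 3} and {1, 3, 4}, which has width 2.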

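Define $I_0 : \mathbb{N} = \{1, 2, \ldots\} \to \mathbb{N}$ by $I_0(n) = \lceil n/2 \rceil$, for all $n \in \mathbb{N}$. Inductively,
for $k = 1, 2, \ldots$, define $I_k : \mathbb{N} \to \mathbb{N}$ by $I_k(n) = \min\{i \in \mathbb{N} \mid I_{k-1}^{(i)}(n) = 1\}$, for
all $n \in \mathbb{N}$, where superscript $(i)$ denotes $i$-fold repeated application. Finally, for
all $n \in \mathbb{N}$, take $\alpha(n) = \min\{k \in \mathbb{N} \mid I_k(n) \le 3\}$.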
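These functions are easy to tabulate for small arguments. The sketch below follows the inductive definition literally, under two assumed readings of the text: $I_0$ is the ceiling of $n/2$, and the threshold in the definition of $\alpha$ is "at most 3". The function names are illustrative.

```python
def I(k, n):
    """I_0(n) = ceil(n / 2); for k >= 1, I_k(n) is the number of times
    I_{k-1} must be applied to n, at least once, to reach 1."""
    if k == 0:
        return (n + 1) // 2
    count, x = 0, n
    while True:
        x = I(k - 1, x)
        count += 1
        if x == 1:
            return count


def alpha(n):
    """Smallest k >= 1 with I_k(n) <= 3 (assumed threshold)."""
    k = 1
    while I(k, n) > 3:
        k += 1
    return k
```

Under these readings, $I_1$ behaves like a binary logarithm and $I_2$ like an iterated logarithm, which illustrates how slowly the functions grow as $k$ increases.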

3  Static  Algorithms 

Given an MS formula $\Phi$ with $l$ free set variables and without free simple variables
and a commutative semiring $\mathcal{R} = (R, \oplus, \otimes, 0, 1)$, we say that a graph
$G = (V, E)$ is appropriate for the pair $(\Phi, \mathcal{R})$ if each $a \in V \cup E$ has a boolean
attribute for each unary predicate symbol occurring in $\Phi$ and $l$ ring attributes
$f_1(a), \ldots, f_l(a) \in R$. The (static) RMS problem defined by $\Phi$ and $\mathcal{R}$ is, given
a graph $G$ appropriate for $(\Phi, \mathcal{R})$, to compute the value $|G|_{\Phi,\mathcal{R}}$ of $G$ under $\Phi$
and $\mathcal{R}$. Courcelle and Mosbah (1993) show that every RMS problem defined
by an MS formula $\Phi$ without free simple variables and an efficient commutative
semiring $\mathcal{R}$ can be solved in linear time on graphs of bounded treewidth. Our
dynamic algorithms are based on a different proof of their result, which uses
techniques of Arnborg et al. (1991), and which we now sketch.




Theorem 1. (Courcelle and Mosbah, 1993) For all constants $t \ge 1$ and all
integers $n \ge 1$, every RMS problem defined by an MS formula $\Phi$ without free
simple variables and an efficient commutative semiring $\mathcal{R}$ can be solved in $O(n)$
time on $n$-vertex input graphs appropriate for $(\Phi, \mathcal{R})$ and of treewidth at most $t$.

Proof. Let $G = (V_G, E_G)$ be an $n$-vertex input graph appropriate for $(\Phi, \mathcal{R})$ and
of treewidth at most $t$ and take $\Omega = V_G \cup E_G$. Arnborg et al. (1991) show that
$O(n)$ time suffices to construct an MS formula $\Phi'$ with the same free variables as
$\Phi$, a rooted binary tree $T^* = (V^*, E^*)$ appropriate for $(\Phi', \mathcal{R})$, and an injective
function $\pi : \Omega \to V^*$ so that the following holds: Suppose that $\Phi$ has $l$ free set
variables. Then, for all $A_1, \ldots, A_l \subseteq \Omega$, $G \models \Phi[A_1, \ldots, A_l]$ if and only if
$T^* \models \Phi'[\pi(A_1), \ldots, \pi(A_l)]$; moreover, $T^* \not\models \Phi'[B_1, \ldots, B_l]$ whenever $\bigcup_{i=1}^{l} B_i \not\subseteq \pi(\Omega)$.
Intuitively, if we identify $a$ and $\pi(a)$, for all $a \in \Omega$, then $T^*$ satisfies $\Phi'$ under
a particular assignment (association of free set variables with sets of vertices
and/or edges) if and only if $G$ satisfies $\Phi$ under the same assignment.

Informally, a finite tree automaton is the natural generalization of a usual
finite automaton from inputs that are strings to inputs that are binary trees.
Formally, we can take a finite tree automaton to be a 5-tuple $(S, \Sigma, \delta, s_0, F)$,
where $S$ is a finite set of states, $\Sigma$ is a finite alphabet, $\delta$ is a transition function
from $S \times S \times \Sigma$ to $S$, $s_0 \in S$ is a distinguished initial state, and $F \subseteq S$ is a
distinguished set of accepting states. Given a binary tree, each of whose vertices
is labeled with an element of $\Sigma$, the tree automaton assigns a state to each vertex
in the tree, working from the leaves to the root (i.e., processing each vertex after
all of its children). If the left and right children of a vertex $v$ are assigned states
$s$ and $t$, respectively, and $v$ is labeled $\sigma$, the state $\delta(s, t, \sigma)$ is assigned to $v$; if
one or both children are missing, the initial state $s_0$ is used in place of their
states. The tree automaton accepts the input tree exactly if the state assigned
to the root belongs to $F$. Arnborg et al. (1991) show how to construct a tree
automaton $M = (S, \Sigma, \delta, s_0, F)$ with the following property: Suppose that the
unary predicates appearing in $\Phi'$ are $P_1, \ldots, P_k$. Then $\Sigma = \{\mathit{false}, \mathit{true}\}^{k+l}$, and
for arbitrary subsets $A_1, \ldots, A_l$ of $V^*$, if each vertex $v \in V^*$ is labeled with
the bit vector $(P_1(v), \ldots, P_k(v), b_1, \ldots, b_l) \in \Sigma$, where $b_i = \mathit{true}$ iff $v \in A_i$, for
$i = 1, \ldots, l$, then $M$ accepts $T^*$ exactly if $T^* \models \Phi'[A_1, \ldots, A_l]$.
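The bottom-up run of a finite tree automaton takes only a few lines. In the sketch below a binary tree is either `None` (a missing child) or a triple `(label, left, right)`. The example automaton, which accepts trees with an even number of vertices labeled `True`, is a toy chosen for illustration and is unrelated to the automaton of Arnborg et al.

```python
def run_automaton(delta, s0, tree):
    """Return the state assigned to the root; a missing child contributes s0."""
    if tree is None:
        return s0
    label, left, right = tree
    return delta(run_automaton(delta, s0, left),
                 run_automaton(delta, s0, right),
                 label)


def accepts(delta, s0, accepting, tree):
    """The automaton accepts iff the root state is an accepting state."""
    return run_automaton(delta, s0, tree) in accepting


def parity_delta(s, t, sigma):
    """Toy transition function: states 0 and 1 track the parity of the
    number of True labels in the subtree."""
    return s ^ t ^ int(sigma)
```

Each vertex is processed once after its children, so the run takes time linear in the size of the tree when each application of the transition function takes constant time.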

We show how to derive from $M$ another tree automaton $M' = (S', \Sigma', \delta', s'_0)$
to solve the RMS problem at hand. $M'$ is not a finite automaton, since both its
alphabet and its state set may be infinite, and it will compute a value (namely,
$|G|_{\Phi,\mathcal{R}}$) rather than just accepting or rejecting, for which reason it has no set of
accepting states; in other respects, $M'$ behaves exactly as a finite tree automaton.

Write $\mathcal{R} = (R, \oplus, \otimes, 0, 1)$, let $m = |S|$, and identify the states of $M$ with the
integers $1, \ldots, m$, with 1 being the initial state. We take the state set $S'$ of $M'$
to be the set $R^m$ of vectors of length $m$ with components in $R$, and define the
initial state $s'_0$ as $(1, 0, \ldots, 0)$. The alphabet of $M'$ is $\Sigma' = \{\mathit{false}, \mathit{true}\}^k \times R^l$,
and the label of a vertex $v \in V^*$ is $(P_1(v), \ldots, P_k(v), f_1(v), \ldots, f_l(v))$, where
$f_1, \ldots, f_l$ are the cost functions copied to $T^*$ from the input graph $G$ according
to $\pi$, i.e., for $i = 1, \ldots, l$, $f_i(\pi(a)) = f_i(a)$ for all $a \in \Omega$, and $f_i(a) = 0$ for
all $a \in V^* \setminus \pi(\Omega)$. We next define the transition function $\delta'$. Assume that the
states of the (possibly fictitious) left and right children of a vertex $u \in V^*$ are
$(s_1, \ldots, s_m)$ and $(t_1, \ldots, t_m)$, respectively. Then the state of $u$ is $(r_1, \ldots, r_m)$,
where

\[ r_j = \bigoplus_{p=1}^{m} \; \bigoplus_{q=1}^{m} \; \bigoplus_{\substack{(b_1, \ldots, b_l) \in \{\mathit{false}, \mathit{true}\}^l \\ \delta(p, q, (P_1(u), \ldots, P_k(u), b_1, \ldots, b_l)) = j}} \Bigl( s_p \otimes t_q \otimes \bigotimes_{\substack{1 \le i \le l \\ b_i = \mathit{true}}} f_i(u) \Bigr), \]


for $j = 1, \ldots, m$. It can be seen that the sum for the $j$th component (corresponding
to the $j$th state of $M$) is over those pairs of states of $M$ and those choices
of (non)membership of $u$ in $A_1, \ldots, A_l$ that would lead the original automaton
to give $u$ the state $j$, for $j = 1, \ldots, m$. It can then be proved by induction that
for each vertex $u \in V^*$ and for $j = 1, \ldots, m$, the $j$th component of the state
assigned to $u$ is

\[ \bigoplus_{\substack{A_1, \ldots, A_l \subseteq U \\ M(u, A_1, \ldots, A_l) = j}} \; \bigotimes_{i=1}^{l} \; \bigotimes_{a \in A_i} f_i(a), \]

where $U$ is the set of descendants of $u$ in $T^*$, and $M(u, A_1, \ldots, A_l)$ denotes the
state assigned by $M$ to $u$ if the vertex labels are set according to $A_1, \ldots, A_l$. If the
state of the root of $T^*$ computed by $M'$ is $(s_1, \ldots, s_m)$, this observation shows
that $|G|_{\Phi,\mathcal{R}} = |T^*|_{\Phi',\mathcal{R}} = \bigoplus_{j \in F} s_j$, which the automaton therefore computes and
returns. Since $m$ is a constant (for fixed $\Phi$), each application of $\delta'$ takes constant
time, so that the entire processing of $T^*$ by $M'$ can be carried out in $O(n)$ time.
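The construction of $M'$ from $M$ can be tried out on a toy instance. The sketch below assumes a hypothetical three-state automaton for the property "$A$ is an independent set of the tree" (one set variable, no unary predicates), a tree encoding `(cost, left, right)` with `None` for a missing child, and two example semirings; none of these choices come from the text. The value is read off as the "sum" of the components belonging to accepting states.

```python
from itertools import product

# Toy automaton M: 1 = subtree root not in A, 2 = subtree root in A,
# 3 = some vertex of A has a child in A. State 1 is the initial state.
M_STATES = (1, 2, 3)
ACCEPTING = {1, 2}


def delta(p, q, b):
    """Transition of M: p, q are the children's states, b says whether u is in A."""
    if p == 3 or q == 3:
        return 3
    if b:
        return 3 if (p == 2 or q == 2) else 2
    return 1


class Semiring:
    def __init__(self, add, mul, zero, one):
        self.add, self.mul, self.zero, self.one = add, mul, zero, one


COUNTING = Semiring(lambda a, b: a + b, lambda a, b: a * b, 0, 1)
MAX_PLUS = Semiring(max, lambda a, b: a + b, float("-inf"), 0)


def delta_prime(R, s, t, f_u):
    """Transition of M': r_j sums s_p * t_q * (f_u if b else 1) over all
    (p, q, b) that M maps to state j."""
    r = {j: R.zero for j in M_STATES}
    for p, q, b in product(M_STATES, M_STATES, (False, True)):
        j = delta(p, q, b)
        term = R.mul(R.mul(s[p], t[q]), f_u if b else R.one)
        r[j] = R.add(r[j], term)
    return r


def run(R, tree):
    """State vector of M' at the root; a missing child has state (1, 0, 0)."""
    if tree is None:
        return {1: R.one, 2: R.zero, 3: R.zero}
    cost, left, right = tree
    return delta_prime(R, run(R, left), run(R, right), cost)


def value(R, tree):
    """'Sum' of the root components that belong to accepting states of M."""
    state = run(R, tree)
    total = R.zero
    for j in ACCEPTING:
        total = R.add(total, state[j])
    return total
```

With the counting semiring and unit costs, `value` returns the number of independent sets of the tree. With the max-plus semiring it returns the maximum total cost of an independent set. The same automaton serves both purposes, which is the point of separating the formula from the semiring.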


4  Data  Structures  for  Queries 

In this section we describe data structures that support queries efficiently, but
not updates. Given an MS formula $\Phi$ without free simple variables, a commutative
semiring $\mathcal{R}$, and a constant $d \in \mathbb{N}$, the $d$-dimensional RMS query problem
defined by $\Phi$ and $\mathcal{R}$ is, given a graph $G$ appropriate for $(\Phi, \mathcal{R})$, to preprocess $G$
for subsequent queries for quantities of the form $|G'|_{\Phi,\mathcal{R}}$, where $G'$ is obtained
from $G$ by (temporarily) changing at most $d$ boolean and/or ring attributes.

Let the tree $T^*$ and the machines $M$ and $M'$ be as in the proof of Theorem 1
and consider a vertex $u \in V^*$ with left and right children $v$ and $w$, respectively.
Let $(r_1, \ldots, r_m)$, $(s_1, \ldots, s_m)$, and $(t_1, \ldots, t_m)$ be the states assigned by $M'$ to
$u$, $v$, and $w$, respectively. Then, by definition of the transition function $\delta'$, we
have $r_j = \bigoplus_{p=1}^{m} c_{jp} \otimes s_p$ for $j = 1, \ldots, m$, where

\[ c_{jp} = \bigoplus_{q=1}^{m} \; \bigoplus_{\substack{(b_1, \ldots, b_l) \in \{\mathit{false}, \mathit{true}\}^l \\ \delta(p, q, (P_1(u), \ldots, P_k(u), b_1, \ldots, b_l)) = j}} \Bigl( t_q \otimes \bigotimes_{\substack{1 \le i \le l \\ b_i = \mathit{true}}} f_i(u) \Bigr), \]

for $p = 1, \ldots, m$. In other words, provided that the state of $w$ remains constant,
the function that maps the state of $v$ to the state of $u$ is premultiplication with
an $m \times m$ matrix (over the semiring $\mathcal{R}$). We call this function the relay function
of the edge $\{u, v\}$ (for the input graph $G$). The relay functions of edges between
vertices and their right children are defined in complete analogy and have the
same form.

Suppose that G is changed into G' by modifying a single boolean or ring
attribute of some vertex or edge a. This translates into a change of a single
boolean or ring attribute of $v = \pi(a)$ in T* or, as seen from the point of
view of M', into a change of the label of v. We can compute the answer for G'
by simulating the execution of M' on the new label settings. One way to do this
is to compose the relay functions of all edges on the path from v to the root r*
of T*, and then to apply the resulting function to the new state of v; this
yields the new state of the root, from which the answer can be computed in
constant time. Similarly, a query that changes the labels of two vertices v and
w can be handled by composing relay functions along the two paths from v and w
to the children of the lowest common ancestor (LCA) u of v and w, using the
result to compute the new state of u, and then propagating the change to r* by
composing the relay functions on the path from u to r*. Answering queries
therefore essentially reduces to composing functions along paths in T*, a
problem that has been studied in a more general setting.
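
As a toy illustration of composing relay functions along a path, the following sketch assumes the (min, +) semiring as a stand-in for the semiring R and represents each relay function as a small square matrix. The names `mat_mul`, `mat_vec` and `compose_path`, as well as the example matrices, are hypothetical and not taken from the paper.

```python
from functools import reduce

INF = float("inf")  # additive identity of the assumed (min, +) semiring


def mat_mul(a, b):
    """Product of two m x m matrices over the (min, +) semiring."""
    m = len(a)
    return [[min(a[i][p] + b[p][j] for p in range(m)) for j in range(m)]
            for i in range(m)]


def mat_vec(a, s):
    """Premultiply the state vector s by the relay matrix a."""
    return [min(row[p] + s[p] for p in range(len(s))) for row in a]


def compose_path(relays):
    """Compose relay matrices listed from the changed vertex upwards.

    The relay function of the edge nearest the root is applied last, so its
    matrix ends up leftmost in the product.
    """
    return reduce(lambda acc, r: mat_mul(r, acc), relays)


# Relay matrices of two consecutive edges on the path, lowest edge first.
lower = [[0, 2], [1, 0]]
upper = [[3, 0], [INF, 1]]
new_state = [5, 1]  # new state of the changed vertex
root_state = mat_vec(compose_path([lower, upper]), new_state)
```

Because matrix multiplication over a semiring is associative, applying the composed matrix once gives the same root state as pushing the state through the edges one at a time; this is the property that lets path composition be delegated to a path-weight data structure.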

Let us call a semigroup $\mathcal{S} = (S, \odot)$ efficient if $a \odot b$ can
be computed from a and b in constant time for all $a, b \in S$. We can assume
without loss of generality that S contains a neutral element. In the context of
a tree T, each of whose edges is labeled by an element of a semigroup
$(S, \odot)$ called its weight, we define the weight of a (simple) path in T of
length k as the quantity $\lambda_1 \odot \cdots \odot \lambda_k$, where
$\lambda_i$ is the weight of the $i$th edge on the path, for $i = 1, \ldots, k$,
and we define a path-weight query as a query that specifies two vertices u and v
and asks for the weight of the (unique) path in T from u to v. The following
lemma was proved by Chazelle (1987) and Alon and Schieber (1987).

Lemma 2. For all $n, k \in \mathbb{N}$, an n-vertex tree with edge weights drawn
from an efficient semigroup $(S, \odot)$ can be preprocessed for path-weight
queries with preprocessing time $O(n I_k(n))$ and query time $O(k)$.

A particularly interesting special case of preprocessing time $O(n)$ and query
time $O(\alpha(n))$ is obtained by choosing $k = \alpha(n)$. Similar remarks
apply below.
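
To make the notion of a path-weight query concrete, here is a naive baseline that simply walks both endpoints up to their lowest common ancestor and combines the edge weights in path order. It is not the data structure of Lemma 2 (its query time is proportional to the path length); the array-based tree representation and the function name `path_weight` are assumptions made for this sketch.

```python
def path_weight(parent, weight, depth, u, v, op, identity):
    """Weight of the path from u to v, combined in path order with `op`.

    parent[x] is the parent of x (the root is its own parent), weight[x] is
    the weight of the edge {x, parent[x]}, and depth[x] is the depth of x.
    """
    up, down = [], []
    while depth[u] > depth[v]:          # lift the deeper endpoint u
        up.append(weight[u])
        u = parent[u]
    while depth[v] > depth[u]:          # lift the deeper endpoint v
        down.append(weight[v])
        v = parent[v]
    while u != v:                       # climb together until the LCA
        up.append(weight[u])
        u = parent[u]
        down.append(weight[v])
        v = parent[v]
    result = identity
    for w in up + down[::-1]:           # edges in the order met from u to v
        result = op(result, w)
    return result


# Toy tree: 0 is the root, 1 and 2 are its children, 3 is a child of 1.
parent = [0, 0, 0, 1]
depth = [0, 1, 1, 2]
weight = [None, "a", "b", "c"]          # strings under concatenation


def concat(s, t):
    return s + t
```

String concatenation is used as the example semigroup because it is not commutative, so the order in which the edges are combined is visible in the result.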

Theorem 3. For all constants $t \ge 1$ and all integers $n, k \ge 1$, every
t-dimensional RMS query problem defined by an MS formula $\varphi$ without free
simple variables and an efficient commutative semiring $\mathcal{R}$ can be
solved on n-vertex input graphs appropriate for $(\varphi, \mathcal{R})$ and of
treewidth bounded by t with preprocessing time $O(n I_k(n))$ and query time
$O(k)$.

Proof. We preprocess the tree T* of the proof of Theorem 1 according to Lemma
2, the weight of each edge being its relay function and $\odot$ being function
composition (i.e., matrix multiplication over $\mathcal{R}$). We also preprocess
T* so that subsequent queries for the LCA of two arbitrary vertices can be
answered in constant time; it is known how to do this in $O(n)$ time (Harel and
Tarjan, 1984; Schieber and Vishkin, 1988).

Suppose that a query changes the labels of the vertices in some set
$U \subseteq V^*$ (thus $|U| \le t$). Let $Q = U \cup W \cup \{r^*\}$, where W
is the set of all lowest common ancestors of two vertices in U and r* is the
root of T*; Q is still of bounded size. Let T = (V, E) be the tree obtained from
T* by contracting each vertex in $V^* \setminus Q$ into its closest ancestor
whose parent belongs to Q. With the aid of still more LCA queries, to determine
for all $u, v \in Q$ whether u is an ancestor of v in T*, T can be constructed
in constant time. We now process T from the leaves to the root, for each vertex
v in T computing the new state assigned to v by M' after the label changes
caused by the update. For a vertex $v \in Q$ this is trivial, since the new
states of its children in T*, if any, will be known when v is processed. For a
vertex $v \in V \setminus Q$, on the other hand, all descendants of v in T* that
belong to U, if any, are descendants of a single vertex in Q whose new state is
known when v is processed. Thus the new value of v can be computed in $O(k)$
time by composing relay functions according to Lemma 2. Once the new state of r*
is known, the query can be answered in constant time.

5  Dynamic  Data  Structures 

In  this  section  we  dynamize  the  path-weight-query  data  structure  of  Lemma  2 
to  allow  updates  of  edge  weights  as  well  as  path-weight  queries  and  state  the 
implications  for  dynamic  RMS  problems. 

Theorem 4. For all integers $n, k, \tau \ge 1$, every n-edge tree with edge
weights drawn from an efficient semigroup $(S, \odot)$ can be preprocessed for
path-weight queries with preprocessing time $O(n I_k(n))$, query time
$O(\tau + k)$, and update time $O(\tau n^{1/\tau})$.

Proof. We reuse part of a scheme developed by Chazelle (1987) in order to prove
Lemma 2. For a parameter m with $1 \le m \le n$ to be chosen below, we partition
the edge set E of the input tree T = (V, E) into at most 3n/m sets, each of
which spans a subtree of T, called a piece, with at most m edges. Chazelle shows
how to do this in $O(n)$ time (Lemma 3). Call a vertex of T a fringe vertex if
it is shared between two or more pieces. In order to make what follows clearer,
let us assume that we separate the pieces by replacing each fringe vertex v,
shared between d pieces, by a star consisting of a central vertex, which we
identify with v, connected to d new vertices; each of the d new vertices is
associated with a different piece containing v, is called the representative of
v in that piece, and replaces v as an endpoint of each edge belonging to the
piece and incident on v. Provided that each star edge is given a weight of 0,
the neutral element of $(S, \odot)$, this transformation does not change the
weight of the path between any two vertices in T. It at most triples the number
of edges and is easily carried out in $O(n)$ time.

The number of fringe vertices is bounded by 3n/m, and Chazelle shows how to
construct an edge-weighted tree T* with at most 6n/m edges that contains all
fringe vertices and assigns the same weight as T to the path between any two
fringe vertices; T* is obtained in $O(n)$ time from T by removing all nonfringe
vertices that have fewer than three incident edges lying on paths between fringe
vertices and replacing paths of such vertices by single edges with the same
weight.

Each piece is preprocessed independently for path-weight queries, and the
global tree T* is processed recursively as just described. For all $u, v \in V$,
denote by $\Lambda(u, v)$ the weight of the path in T from u to v. Consider two
vertices u and v in T and let x and y be the first and last fringe vertices on
the path in T from u to v, respectively, if any. If x and y do not exist, u and
v belong to the same piece, and the weight $\Lambda$ of the path from u to v can
be obtained from the data structure maintained for that piece. Otherwise
$\Lambda = \Lambda(u, x) \odot \Lambda(x, y) \odot \Lambda(y, v)$. If x = u
(u is a fringe vertex), $\Lambda(u, x) = 0$; otherwise
$\Lambda(u, x) = \Lambda(u, r_x)$, where $r_x$ is the representative of x in the
piece of u, and the latter quantity can be obtained from the data structure
maintained for the piece of u. $\Lambda(y, v)$ is computed similarly, and
$\Lambda(x, y)$ is obtained recursively from the data structures maintained for
T*. One small issue, how to determine x and y and possibly $r_x$ and $r_y$, is
resolved with the help of yet another tree $T^+$, obtained from T by replacing
all edges within each piece by edges from each (nonfringe) vertex in the piece
to a new vertex representing the piece. The vertices of interest occur among the
first four and the last four vertices on the path in $T^+$ from u to v and can
be identified by two applications of the algorithm of Lemma 2: the weight of
each edge is its identity, considered as a string of length 1, and $\odot$ is
“truncated concatenation”, which concatenates its two arguments but, if the
resulting string is of length $\ge 4$, keeps only its suffix of length 3.

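The truncated concatenation used above can be sketched as follows, under the assumption that strings of edge identities are represented as Python tuples and that at most three trailing symbols are retained. The function name and the `keep` parameter are hypothetical.

```python
from functools import reduce


def truncated_concat(a, b, keep=3):
    """Concatenate two tuples of edge identifiers, keeping only the last
    `keep` items whenever the concatenation is longer than that."""
    s = a + b
    return s[-keep:] if len(s) > keep else s


# Each edge on a path carries its own identity as a string of length 1.
edges = [("e1",), ("e2",), ("e3",), ("e4",), ("e5",)]
last_three = reduce(truncated_concat, edges)
```

Capping the length keeps every weight of constant size, so the operation runs in constant time, and because the result is always the suffix of the full concatenation the operation stays associative, as required of a semigroup.
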
Without loss of generality assume that $k \ge 2$. On the first recursive level we
choose  m  —  ?77o  =  [  y/ h  (nJl  and  preprocess  the  pieces  for  path- weight  queries 
according to Lemma 2. This needs a total of $O(n I_k(n))$ time and provides a
query time of $O(k)$. On all subsequent recursive levels we choose $m = m_1 =$
maxIfy^A/^^"^)],  12}  and  preprocess  the  pieces  for  path- weight  queries  according 
to Lemma 2 with $k = 2$, ending the recursion when the number of edges drops
below 12. This provides a query time of $O(1)$ per recursive level, and since
7770  =  the  preprocessing  effort  sums  to  0(r7)  over  all  levels.  Because  the 

recursive depth is $O(\log n/\log m_1) = O(\tau)$, the overall query time is
$O(\tau + k)$. An update of an edge weight requires recomputation of data
structures maintained for a single piece on each recursive level, and thus needs
$O(m_0 I_k(n))$ time on the first level and $O(m_1 I_2(m_1))$ time on all
subsequent levels, resulting in an overall update time of $O(\tau n^{1/\tau})$.

Given an MS formula $\varphi$ without free simple variables, a commutative
semiring $\mathcal{R}$, and a constant $d \in \mathbb{N}$, the d-dimensional
dynamic RMS problem defined by $\varphi$ and $\mathcal{R}$ is, given a graph G
appropriate for $(\varphi, \mathcal{R})$, to preprocess G for
subsequent  updates  of  single  boolean  or  ring  attributes  and  queries  for  quantities 
of  the  form  where  G'  is  obtained  from  (the  current)  G  by  (temporarily) 

changing  at  most  d  boolean  and/or  ring  attributes.  As  an  immediate  consequence 
of  Theorem  4  and  the  methods  introduced  in  Section  4,  we  obtain: 

Theorem 5. For all constants $t \ge 1$ and all integers $n, k, \tau \ge 1$, every
t-dimensional dynamic RMS query problem defined by an MS formula $\varphi$
without free simple variables and an efficient commutative semiring
$\mathcal{R}$ can be solved on n-vertex input graphs appropriate for
$(\varphi, \mathcal{R})$ and of treewidth bounded by t with preprocessing time
$O(n I_k(n))$, query time $O(\tau + k)$, and update time $O(\tau n^{1/\tau})$.


The Name Discipline of Uniform Receptiveness
(Extended Abstract)


Davide  Sangiorgi 
INRIA  -  Sophia  Antipolis,  France. 


1  Introduction 

The $\pi$-calculus [9] is a paradigmatic process calculus for message-passing
concurrency. Two processes with acquaintance of a given name can use it to
interact with each other. Names themselves may be exchanged in communications,
which can model modifications of the linkage structure among processes. These
are the basic process constructs (using lower case for names and upper case for
processes): $\bar{a}\langle b\rangle.P$, the output of b at a with P as
continuation; $a(b).P$, an input at a with b a placeholder for the name received
in the input; $P_1 \mid P_2$, the parallel composition of the two processes;
$\nu a\,P$, which makes name a local to P; and $!P$, which denotes a potentially
infinite number of copies of P in parallel.

In this paper, we study the situation in which certain names are uniformly
receptive. A name x is receptive in a process P if at any time P is able to
offer an input at x (at least as long as there are processes that could send
messages at x). The receptiveness of x is uniform if all inputs at x have the
same continuation. Receptiveness ensures that any message sent at x can be
immediately processed; uniformity ensures that there is a unique way in which a
message at x may be processed (that is, the input end of x is “functional”).

These are semantic conditions, and are undecidable. To obtain decidable
conditions we impose some restrictions. Roughly, we guarantee receptiveness by
demanding that the name is available in input-replicated form as soon as
created. For instance, x is receptive in

$$P_1 \stackrel{\mathrm{def}}{=} \nu x\,(!x(p).P \mid Q) \qquad\qquad P_2 \stackrel{\mathrm{def}}{=} \nu x\,(\bar{f}\langle x\rangle.\,!x(p).P) \qquad (1)$$

(On the right, name x is created when the output $\bar{f}\langle x\rangle$ is
consumed since, before this, x is frozen.¹) We guarantee uniformity by demanding
that there is only one input occurrence of the name; hence in (1) name x should
not occur free in input position in P and Q. To preserve the uniformity property
in a network of processes, we then also demand that only the output capability
of the name may be transmitted; that is, like all $\pi$-calculus names, uniform
receptive names can be transmitted but, in contrast with the other names, they
can be used by a recipient only in output (retransmitting the name, or sending a
message at it).

In the processes $P_1$ and $P_2$ above, the receptiveness at x is persistent,
which is necessary if unboundedly many messages could be sent at x. It is useful
to consider separately the case in which at most one message can be sent. Then
the replication in front of the input at x is unnecessary. We call the first
form $\omega$-receptiveness, the second linear receptiveness.

¹ Indeed $P_2$ is behaviourally the same as
$\nu x\,(!x(p).P \mid \bar{f}\langle x\rangle)$, which is of the same form as
$P_1$.

Uniform receptiveness corresponds to a precise discipline in the usage of
names; it could be formulated by syntactic means, but it is easier and more
elegant to do so using a type system along the lines of type systems for the
$\pi$-calculus. The impact of receptiveness on behavioural equivalences and
process reasoning is the main focus of this paper. We shall develop some theory
and proof techniques for processes with receptive names, and then illustrate
their usefulness by means of some non-trivial examples, like the proof of some
transformations that introduce parallelism in a resource, and the proof of the
correctness of an optimisation of the translation of higher-order process
calculi into the $\pi$-calculus [18, 13], which is adopted in the compiler of
Pict [12].

The challenge in these examples is that the equalities implied by the
transformations fail in the ordinary $\pi$-calculus (even w.r.t. the very coarse
notion of trace equivalence). That is, there are contexts of the ordinary
$\pi$-calculus that are able to detect the difference between the processes of
the equalities. By imposing the type system for receptiveness, these contexts
are ruled out as ill-typed.

Uniform receptiveness often occurs in the $\pi$-calculus. Our first example is
the coding of functions. A process Q with a local function $\lambda r.M$,
accessible via a name z, is normally written $\nu z\,(!z(r, y).P \mid Q)$ where
P is the coding of M and y is (a placeholder for) the name where the result of a
function call will be delivered. Within Q, a call of the function with argument
n is written $\nu x\,(\bar{z}\langle n, x\rangle.\,x(p).Q')$ where p is (a
placeholder for) the result of the call. In the function declaration, z is
$\omega$-receptive; in the function call, x is linear receptive. Similar
combinations of linear and $\omega$-receptiveness occur in the coding of
higher-order communications and of object-oriented languages. Typically,
$\omega$-receptiveness occurs in the modelling of resources which are private to
one or more client processes (above, the resource is a function). A discipline
similar to $\omega$-receptiveness is presently used in the compiler of Pict
[12], to allow optimisations of the code implementing communications. An
important example of linear receptiveness (indeed, perhaps the most important)
is found in process interactions based on the Remote Procedure Call (RPC)
paradigm. An RPC interaction involves two synchronisations between a caller and
a callee where, after the first synchronisation, the caller waits the time
necessary for the callee to elaborate a response. When modelling RPCs in the
$\pi$-calculus, the return name at which the callee delivers its response is
used as linear receptive. (The function call above too is an example of an RPC
interaction.)

As behavioural equivalence on processes, we use barbed equivalence. This
equates processes which, very roughly, in all contexts give rise to the same
patterns of interactions. The main inconvenience of barbed equivalence is that
its definition uses quantification over contexts, and this can make proofs of
process equalities heavy. Against this, one looks for direct characterisations,
without context quantification. For instance, in CCS and in the ordinary
$\pi$-calculus barbed equivalence coincides with the well-known labeled
bisimilarities [13]. (In a labeled bisimilarity the bisimulation game is played
not only on silent actions, as for barbed bisimulation, but also on input and
output actions.)

We sketch the essential points of our theory for processes with receptive
names. The schema is the same for linear and for $\omega$-receptiveness. We
first introduce a type system which forces the receptiveness discipline, and
prove some basic properties for it. Secondly, we isolate a subclass of the
well-typed processes, called discreet processes, roughly characterised by the
property that all receptive names which are emitted are private to the sender.
Discreet processes are defined by means of syntactic restrictions on the output
prefix similar to those in the language $\pi$I [15]. Thirdly, we introduce a
simple but powerful algebraic law, with which any well-typed process can be
transformed into a discreet process. Remarkably, this law equates a process
whose first action is the output of a global name with a process whose first
action is the output of a private name. The law is not valid in the untyped
$\pi$-calculus, but it is valid under the receptiveness type system. Finally, we
prove a direct characterisation of barbed equivalence on discreet processes, as
a labelled bisimilarity called receptive bisimilarity. The latter differs from
the ordinary bisimilarity in the requirement for input actions, but otherwise it
can be used with the standard co-inductive techniques of labelled bisimilarities,
including proof techniques such as “bisimulation up to expansion”.

For  lack  of  space,  some  definitions  and  most  of  the  proofs  are  omitted.  More 
examples  can  be  found  in  [17]. 


2  Some background on the $\pi$-calculus

We use lower case letters p, q, r, ... to range over names, and upper case
letters P, Q, R to range over the set $\mathcal{P}$ of processes. This is the
$\pi$-calculus grammar (for simplicity, we develop our theory on the monadic
calculus):

$$P ::= \mathbf{0} \;\mid\; p(q).P \;\mid\; \bar{p}\langle q\rangle.P \;\mid\; \bar{p}(q).P \;\mid\; [p = q]P \;\mid\; P_1 \mid P_2 \;\mid\; \nu p\,P \;\mid\; P_1 + P_2 \;\mid\; {!}p(q).P$$
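
One possible way to represent the terms of this grammar in code is as a small abstract syntax tree. The sketch below is only an illustration: the class names, the `replicated` and `bound` flags, and the `free_names` helper are assumptions of this example and not notation from the paper.

```python
from dataclasses import dataclass


class Proc:
    """Base class for process terms."""


@dataclass(frozen=True)
class Nil(Proc):            # the inactive process 0
    pass


@dataclass(frozen=True)
class Input(Proc):          # p(q).P; replicated=True encodes !p(q).P
    subject: str
    param: str
    cont: Proc
    replicated: bool = False


@dataclass(frozen=True)
class Output(Proc):         # free output; bound=True encodes bound output
    subject: str
    obj: str
    cont: Proc
    bound: bool = False


@dataclass(frozen=True)
class Match(Proc):          # [p = q]P
    left: str
    right: str
    cont: Proc


@dataclass(frozen=True)
class Par(Proc):            # P1 | P2
    left: Proc
    right: Proc


@dataclass(frozen=True)
class Sum(Proc):            # P1 + P2
    left: Proc
    right: Proc


@dataclass(frozen=True)
class Res(Proc):            # restriction of a name
    name: str
    cont: Proc


def free_names(proc):
    """Set of names occurring free in a process term."""
    if isinstance(proc, Nil):
        return set()
    if isinstance(proc, Input):
        return {proc.subject} | (free_names(proc.cont) - {proc.param})
    if isinstance(proc, Output):
        rest = free_names(proc.cont)
        if proc.bound:      # the object of a bound output is bound in the continuation
            return {proc.subject} | (rest - {proc.obj})
        return {proc.subject, proc.obj} | rest
    if isinstance(proc, Match):
        return {proc.left, proc.right} | free_names(proc.cont)
    if isinstance(proc, (Par, Sum)):
        return free_names(proc.left) | free_names(proc.right)
    if isinstance(proc, Res):
        return free_names(proc.cont) - {proc.name}
    raise TypeError(f"not a process term: {proc!r}")


# A restricted name x with a replicated input at x, in parallel with an
# output of x at f (continuations are taken to be 0 for brevity).
example = Res("x", Par(Input("x", "p", Nil(), replicated=True),
                       Output("f", "x", Nil())))
```

Input prefixes, bound outputs and restrictions are the three binders, which is why only those cases subtract a name from the result.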

We allow the bound-output prefix $\bar{p}(q).P$ in the syntax; often in the
$\pi$-calculus literature, $\bar{p}(q).P$ is given as an abbreviation for
$\nu q\,\bar{p}\langle q\rangle.P$. We use $\sigma$ to range over substitutions;
for any expression E, we write $E\sigma$ for the result of applying $\sigma$ to
E, with the usual renaming convention to avoid captures. We assign sum and
parallel composition the lowest precedence among the operators. We write $p.P$
and $\bar{p}.P$ when the name transmitted at p is not important, and we often
abbreviate $\alpha.\mathbf{0}$ as $\alpha$. The labeled transition system is the
usual one, in the early style. Actions, ranged over by $\mu$, can be of four
forms: $\tau$ (interaction), $p\langle q\rangle$ (an input at p in which q is
received), $\bar{p}\langle q\rangle$ (free output) and $\bar{p}(q)$ (bound
output). In these actions, p is the subject. Free and bound names of actions and
processes are defined as usual. In a statement, we say that a name is fresh to
mean that it is different from the names of other processes or actions in the
statement. Relation $\Longrightarrow$ is the reflexive and transitive closure of
$\xrightarrow{\tau}$, and $\stackrel{\mu}{\Longrightarrow}$ stands for
$\Longrightarrow \xrightarrow{\mu} \Longrightarrow$. $P \Downarrow_p$ holds if
there are P' and an action $\mu$ with subject p s.t.
$P \stackrel{\mu}{\Longrightarrow} P'$. A context C is static if it has the form
$\nu p\,(P \mid [\cdot])$, for some P and p.

Definition 1 (barbed bisimulation, equivalence and congruence). Barbed
bisimulation is the largest symmetric relation $\dot{\approx}$ on processes s.t.
$P \mathrel{\dot{\approx}} Q$ implies:

1. whenever $P \Longrightarrow P'$ then there exists Q' such that
$Q \Longrightarrow Q'$ and $P' \mathrel{\dot{\approx}} Q'$;

2. for each name p, $P \Downarrow_p$ iff $Q \Downarrow_p$.

Two processes P and Q are barbed equivalent, written $P \approx Q$, if for each
static context C it holds that $C[P] \mathrel{\dot{\approx}} C[Q]$; they are
barbed congruent, written $P \cong Q$, if $C[P] \mathrel{\dot{\approx}} C[Q]$
for all contexts.

Barbed equivalence and congruence usually coincide with the ordinary labeled
(early) bisimilarity and congruence of the $\pi$-calculus [13]. The proof of
this fact is simple on the class of image-finite processes (to which most of the
processes one would like to write belong) by exploiting the n-approximants of
the labeled equivalences. We recall that the class of image-finite processes is
the largest subset $\mathcal{I}$ of $\mathcal{P}$ which is derivation closed and
s.t. $P \in \mathcal{I}$ implies that, for all $\mu$, the set
$\{P' : P \stackrel{\mu}{\Longrightarrow} P'\}$, quotiented by
$\alpha$-conversion, is finite.

3  Linear  receptiveness 

The discipline of uniform receptiveness (briefly receptiveness) can be added to
any of the main existing type systems for the $\pi$-calculus. In this paper, our
base type system will be Milner’s sorting, which we now briefly recall. Names
are partitioned into a collection of sorts. Then a sorting function is defined
which maps sorts onto sorts (in the polyadic calculus it maps sorts onto
sequences of sorts). If a sort s is mapped onto a sort t this means that names
in s may only carry names in t; moreover, t is the object sort of s. In the
remainder, we shall assume that there is a sorting system under which all
processes are well-typed. We separate the base type system (Milner’s sorting)
from the typing rules for receptiveness so as to show the essence of the latter
rules.
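
As a small illustration of the sorting discipline just recalled, the following sketch checks that the object of a prefix has the object sort of its subject. The dictionaries `sort_of` and `carries`, the sort names, and the function `respects_sorting` are hypothetical and serve only this example.

```python
# Assumed toy sorting: every name belongs to a sort, and `carries` maps each
# sort to its object sort (monadic calculus, so one object sort per sort).
sort_of = {"a": "s", "b": "t", "c": "t"}
carries = {"s": "t", "t": "t"}


def respects_sorting(subject, obj):
    """True if a prefix with this subject and object obeys the sorting."""
    return carries[sort_of[subject]] == sort_of[obj]
```

With this toy sorting, a prefix at a carrying b is accepted, while a prefix at b carrying a is rejected because names of sort t may only carry names of sort t.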

We begin our analysis of receptiveness from the case of linear receptiveness.
We call the non-linear-receptive names plain names. There are no constraints on
plain names except those imposed by the underlying sorting. We shall omit the
adjective “linear” when there is no ambiguity. For simplicity, we assume that
there is a single sort L-recep of linear receptive names, and that linear
receptive names carry plain names. These two assumptions can be relaxed without
difficulties. We also assume the existence of a sort trig of names, different
from L-recep but with the same object sort as L-recep (note that names in trig
are plain names). The sort trig will be used to derive simpler characterisations
of our bisimilarities. In the remainder, x, y, z, ... range over linear
receptive names, a, b, ... over plain names, and v over names in trig. We recall
that p, q, r range over the set of all names. $\Delta, \Gamma$ range over finite
sets of linear receptive names. We sometimes write $\Delta - x$ as abbreviation
for $\Delta - \{x\}$ and $\Delta, x$ for $\Delta \cup \{x\}$, and also x for
$\{x\}$. The type system for linear receptiveness is in Table 1. A rule with
double conclusion is an abbreviation for several rules with the same premises
but separate conclusions. Judgements have the form $\Delta; \Gamma \vdash P$. As
sets, the order in
0-  r  I-  p

(T-inp-mat)  g.  |_  =  (,]p  (T-inp-2) 


(T-inp-3) 


(T-res-1) 


(T-out-1) 


x^r  0;  P  h  P 
x;  r  x{b).  P 

x^PUA  A,x]r^x\rP 
A^Phi^xP 

0;PhP 

0;PI-a(6}.P,  a{b).P 


(T-rep) 


(T-res-2) 


(T-out-2)


x^r  0;r,xhP 
^  0;Pha(a:).P 

0;0|-a(6)P 
0;0  h  !a(6).P 

«^PUZi  A]P\-  P 


A-.Py-upP 

x^P  0;PhP 
0;  P,  x  h  a{x).  P 


,  ^  xgr  0;ri-p  x^r  x;rhP 

0;r,xhx(6>.F,  x(6).P  ^  ^  0;  r  h  a(x).  P 


(T-par) 


(T-nil) 


zii;PihPi  2d2;P2^'P2  AinA2  = 
Ai,A2;PuP2h  Pi  |P2 


Pi  n  P2  =  0 


(T-sum) 


P  h  Pi  0;  P  I-  P2 
0;  P  P  Pi  +  P2 


Table  1.  Typing  rules  for  linear  receptiveness. 


which names appear in $\Delta$ and $\Gamma$ does not matter. Intuitively, if
$\Delta; \Gamma \vdash P$ then $\Delta \cup \Gamma$ are the only receptive names
which appear free in P; process P must use any name in $\Gamma$ exactly once in
output position (that is, either performing an output at that name or
transmitting this capability to another process), and names in $\Delta$
immediately and only once in input. This intuition is formalised in Theorem 2,
which relates types and operational semantics of processes. We say that P is
well typed if there are $\Delta, \Gamma$ s.t. $\Delta; \Gamma \vdash P$ holds.

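Theorem 2 (soundness theorem). Suppose $\Delta; \Gamma \vdash P$.

1. if $x \in \Delta$ then for all a there is a unique P' s.t.
$P \xrightarrow{x\langle a\rangle} P'$;

2. if $P \xrightarrow{a\langle x\rangle} P'$ and $x \notin \Gamma$ then
$\Delta; \Gamma, x \vdash P'$;

3. if $P \xrightarrow{x\langle a\rangle} P'$ then $x \in \Delta$ and
$\Delta - x; \Gamma \vdash P'$;

4. if $P \xrightarrow{a\langle b\rangle} P'$ or
$P \xrightarrow{\bar{a}\langle b\rangle} P'$ or
$P \xrightarrow{\bar{a}(b)} P'$, then $\Delta; \Gamma \vdash P'$;

5. if $P \xrightarrow{\tau} P'$ then either $\Delta; \Gamma \vdash P'$ or there
is $x \in \Delta \cap \Gamma$ and $\Delta - x; \Gamma - x \vdash P'$;

6. if $P \xrightarrow{\bar{x}\langle a\rangle} P'$ or
$P \xrightarrow{\bar{x}(a)} P'$, then $x \in \Gamma$ and
$\Delta; \Gamma - x \vdash P'$;

7. if $P \xrightarrow{\bar{a}(x)} P'$ and $x \notin \Delta \cup \Gamma$ then
$\Delta, x; \Gamma \vdash P'$.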


Behavioural equivalences under linear receptiveness. As usual in typed calculi,
the definitions of the barbed relations take typing into account, so that the
composition of a context and a process is well-typed. With receptiveness, an
additional ingredient has to be taken into account, namely the input
availability of receptive names. If a process has the possibility of using
certain receptive names in output, then a context in which the process is tested
should guarantee the input availability at these names; otherwise the essence of
receptiveness (outputs at receptive names can be immediately consumed) is lost.

Definition 3 (complete processes and contexts). A process P is complete if
$\Delta; \emptyset \vdash P$, for some $\Delta$. We say that context C is
complete on $(\Delta; \Gamma)$ if C[P] is complete, for all P s.t.
$\Delta; \Gamma \vdash P$.

Definition 4 (barbed equivalences under linear receptiveness).
Suppose $\Delta; \Gamma \vdash P, Q$. Then we say that P and Q are barbed
equivalent under linear receptiveness at $(\Delta; \Gamma)$, briefly
$P \approx^{\Delta;\Gamma}_{\mathrm{l}} Q$, if for each static context C which
is complete on $(\Delta; \Gamma)$ it holds that
$C[P] \mathrel{\dot{\approx}} C[Q]$ (where $\dot{\approx}$ is barbed
bisimulation, Definition 1). Barbed congruence under linear receptiveness at
$(\Delta; \Gamma)$, briefly $\cong^{\Delta;\Gamma}_{\mathrm{l}}$, is defined
similarly: just remove the constraint on C being static.

We write $\Delta; \Gamma \vdash_{D} P$ if $\Delta; \Gamma \vdash P$ can be
proved without using rule T-out-2; in this case we say that P is discreet. In a
discreet process, all receptive names which are exported must be private;
syntactically, this means that outputs of global receptive names are disallowed
(that is, using the terminology in [15], only internal mobility, the sending of
fresh names, is allowed on receptive names). We write $p \triangleright q$ as
abbreviation for the process $p(r).\bar{q}\langle r\rangle.\mathbf{0}$ (a
1-place ephemeral buffer from p to q). We can transform well-typed processes
into discreet processes using the law

$$\bar{b}\langle x\rangle.P = \bar{b}(y).(y \triangleright x \mid P) \quad \text{for } y \text{ fresh} \qquad (2)$$

This law makes the output of a global name into the output of a local (i.e.,
private) name. The law is not valid in the ordinary $\pi$-calculus, but it is
valid under receptiveness:

Lemma 5. If $\Delta; \Gamma \vdash \bar{b}\langle x\rangle.P$ and y is fresh,
then $\bar{b}\langle x\rangle.P \cong^{\Delta;\Gamma}_{\mathrm{l}} \bar{b}(y).(y \triangleright x \mid P)$.
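
Read from left to right, law (2) suggests a purely syntactic rewriting that replaces every free output of a receptive name by a bound output of a fresh name plus a link. The sketch below illustrates this on a tuple-based term representation; the tags (`"out"`, `"bout"`, `"in"`, `"par"`, `"nil"`), the set `receptive`, and the fresh-name generator are all assumptions of this example, and no typing is checked.

```python
from itertools import count

NIL = ("nil",)


def link(p, q):
    """The link from p to q: an input at p whose payload is re-sent at q."""
    return ("in", p, "r", ("out", q, "r", NIL))


def make_discreet(proc, receptive, fresh):
    """Rewrite every free output of a receptive name according to law (2)."""
    tag = proc[0]
    if tag == "nil":
        return proc
    if tag == "out":
        _, b, x, cont = proc
        cont = make_discreet(cont, receptive, fresh)
        if x in receptive:
            y = next(fresh)             # assumed not to clash with other names
            return ("bout", b, y, ("par", link(y, x), cont))
        return ("out", b, x, cont)
    if tag in ("in", "bout"):
        _, p, q, cont = proc
        return (tag, p, q, make_discreet(cont, receptive, fresh))
    if tag == "par":
        return ("par",
                make_discreet(proc[1], receptive, fresh),
                make_discreet(proc[2], receptive, fresh))
    raise ValueError(f"unknown process tag: {tag!r}")


fresh_names = (f"y{i}" for i in count())
before = ("out", "b", "x", NIL)         # free output of the receptive name x at b
after = make_discreet(before, {"x"}, fresh_names)
```

In the result, the receptive name x is no longer exported directly: the recipient obtains the private name y0, and messages sent at y0 are forwarded to x by the link.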

We now derive a characterisation of the receptive barbed equivalence as a
labeled bisimulation on discreet processes. We begin by defining the labeled
bisimilarity on complete discreet processes. We say that an action $\mu$ is a
plain input if $\mu$ is the input of a plain name, i.e., $\mu = p\langle a\rangle$
for some plain name a.

Definition 6 (linear-receptive bisimilarity, $\asymp_{\mathrm{l}}$).
Linear-receptive bisimilarity is the largest relation $\asymp_{\mathrm{l}}$ on
complete discreet processes s.t. $P \asymp_{\mathrm{l}} Q$ implies:

1. if $P \xrightarrow{\mu} P'$ with bound name of $\mu$ (if it exists) fresh for
P and Q, and $\mu$ is an output or a plain input then there is Q' s.t.
$Q \stackrel{\mu}{\Longrightarrow} Q'$ and $P' \asymp_{\mathrm{l}} Q'$;

2. if $P \xrightarrow{\tau} P'$ then there is Q' s.t. $Q \Longrightarrow Q'$ and
$P' \asymp_{\mathrm{l}} Q'$;

3. if $P \xrightarrow{p\langle x\rangle} P'$ and x is fresh for P and Q then, for
some fresh name v, there are Q' and Q'' s.t.:
(a) $Q \stackrel{p\langle x\rangle}{\Longrightarrow} Q'$;
(b) $\nu x\,(x \triangleright v \mid Q') \Longrightarrow Q''$;
(c) $\nu x\,(x \triangleright v \mid P') \asymp_{\mathrm{l}} Q''$.

The main novelty of receptive bisimulation is the use of a process
$x \triangleright v$ in the input clause (3). To understand this addition,
recall that x represents a private receptive name that the observer exports; if
the observer behaves as a well-typed process, then it must make x immediately
available in input, as a process of the form $x(p).R$. It is perhaps surprising
that we do not test the behaviour of the derivatives P' and Q' for all the
infinitely many choices of the process $x(p).R$, but only on a single, simple,
process, namely a link $x \triangleright v$.

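Definition 7 (linear-receptive bisimilarity on all discreet processes).
Suppose $\Delta; \Gamma \vdash_{D} P, Q$. Let $\tilde{x} = \Delta \cap \Gamma$
and $\tilde{y} = \Gamma - \Delta$ (therefore $\Gamma = \tilde{x} \cup \tilde{y}$)
and let $\tilde{v}$ be fresh and pairwise distinct names with
$|\tilde{y}| = |\tilde{v}|$. We then set
$P \asymp^{\Delta;\Gamma}_{\mathrm{l}} Q$ if
$(\nu \tilde{x}, \tilde{y})(\tilde{y} \triangleright \tilde{v} \mid P) \asymp_{\mathrm{l}} (\nu \tilde{x}, \tilde{y})(\tilde{y} \triangleright \tilde{v} \mid Q)$.

The definition makes sense because processes
$(\nu \tilde{x}, \tilde{y})(\tilde{y} \triangleright \tilde{v} \mid P)$ and
$(\nu \tilde{x}, \tilde{y})(\tilde{y} \triangleright \tilde{v} \mid Q)$ are
complete and discreet, and we have already defined $\asymp_{\mathrm{l}}$ on this
class. Moreover, since on complete processes $\asymp_{\mathrm{l}}$ is preserved
by structural equality and injective renaming, the above definition does not
depend on the order of names in $\tilde{x}$, $\tilde{y}$ and $\tilde{v}$, and on
the choice of names $\tilde{v}$.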

The closure of barbed bisimulation w.r.t. the static contexts gives the
ordinary (early) labeled bisimulation [13]; the closure w.r.t. the complete static
contexts gives receptive bisimulation. The proofs for the ordinary bisimulation
can be adapted to receptive bisimulation. Here are further useful laws for
receptive barbed equivalence that are easy to prove using the labeled bisimilarity
$\approx_L^{\Delta;\Gamma}$, and that are not valid in the ordinary π-calculus:

If  Z\;  P  h  T(p).  P,  then  x(p).P  x(p)  \  P.  (3) 

Suppose that $\Delta; \Gamma \vdash P, Q$, for some $\Delta$ and $\Gamma$ with $x \in \Delta - \Gamma$, and let $v$ be
a fresh name; then

P  Q  iff  h'x  {v  >  X  \  P)  i/x  (v  [>  X  \  Q)  (4) 

Suppose that $\Delta; \Gamma \vdash P, Q$, for some $\Delta$ and $\Gamma$ with $y \in \Gamma - \Delta$, and let $v$ be a
fresh name; then

P  Q  iff  i'y{yt>v\P)^f'^~'^uy{y>v\Q).  (5) 

Law  (3)  transforms  a  “synchronous”  output  into  an  “asynchronous”  one;  (4) 
transforms  a  global  input  into  a  local  input;  (5)  does  the  same  for  outputs. 

4 ω-receptiveness

The other interesting example of uniform receptiveness is ω-receptiveness, where
the input of a name is always available, and always with the same continuation;
there are no limitations on the use of the name in output. A simple way
of ensuring the uniformity condition on inputs is to require that the only input
occurrence be replicated, i.e., of the form $!x(p).P$.
When adapting the theory of linear receptiveness to ω-receptiveness, there
are several, but not surprising, modifications to make. In the typing system, the
interpretation of a judgement $\Delta; \Gamma \vdash P$ is now that $P$ must make names in $\Delta$
immediately available, in input-replicated form, whereas it may use names in $\Gamma$
arbitrarily many times in output. We only show the new version of rules T-par
and T-out-2, and one of the rules for replication:

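$$\frac{\Delta_1; \Gamma \vdash P_1 \qquad \Delta_2; \Gamma \vdash P_2 \qquad \Delta_1 \cap \Delta_2 = \emptyset}{\Delta_1, \Delta_2; \Gamma \vdash P_1 \mid P_2}$$

$$\frac{x \in \Gamma \qquad \emptyset; \Gamma \vdash P}{\emptyset; \Gamma \vdash \bar{a}\langle x \rangle.P} \qquad\qquad \frac{\emptyset; \Gamma \vdash P}{x; \Gamma \vdash\; !x(b).P}$$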

In the definitions of the typed barbed relations, typed labeled bisimilarities
and the algebraic laws for the ω-case, the main modification w.r.t. the linear
case  is  that  the  links  p  >  q  have  to  become  persistent.  Using  for  barbed 

congruence  under  cj-receptiveness  at  A\  F,  law  (2)  becomes 

b{x).  P  Hy)-  ^  I  P)  for  y  fresh  (6) 


5  Examples 

Parallelisation  of  resources  We  can  use  linear  receptiveness  to  validate  trans¬ 
formations  that  increase  the  parallelism  in  processes.  In  the  processes  below,  we 
use  recursion,  polyadicity  and  communication  of  integers,  which  are  straightfor¬ 
ward  to  accommodate  in  the  theory  of  bisimulation  previously  developed  (recur¬ 
sion  can  be  coded  up).  Thus  m,  n  range  over  integers  and  variables  over  integers. 
Consider  the  process: 

$A_1(b) \stackrel{\mathrm{def}}{=} a(x).\, b(n,c).\, \nu d\; \bar{c}\langle d \rangle.\, \bar{x}\langle n \rangle.\, A_1(d)$

A client can interrogate $A_1(b)$ at $a$, and it will receive at the return channel $x$
an integer $n$ that $A_1(b)$ has received at another channel $b$ (this channel is
renewed at each cycle using $c$). Interactions between $A_1$ and the clients are Remote
Procedure Calls (RPC), therefore the return channels are used according to the
discipline of linear receptiveness (see the discussion on RPC in the introductory
section). The behaviour of $A_1$ is strictly sequential. Let us introduce some
parallelism:

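$A_2(b) \stackrel{\mathrm{def}}{=} a(x).\, b(n,c).\, \nu d\, \big(\bar{x}\langle n \rangle.\, \bar{c}\langle d \rangle \mid A_2(d)\big)$

$A_3(b) \stackrel{\mathrm{def}}{=} a(x).\, \nu d\, \big(b(n,c).\, \bar{c}\langle d \rangle.\, \bar{x}\langle n \rangle \mid A_3(d)\big)$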

Process $A_2(b)$ can accept a second request at $a$ before the answer to the first
request has been delivered; however, answers cannot overtake one another: they
are delivered in the same order in which the requests were made. Process $A_3(b)$
can even accept a request before receiving an integer at $b$; answers can overtake.

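Let now $I(n,b)$ be a counter, $I(n,b) \stackrel{\mathrm{def}}{=} \nu c\; \bar{b}\langle n,c \rangle.\, c(d).\, I(n+1,d)$, and consider
the systems ($n$ is any integer) $S_i(n) \stackrel{\mathrm{def}}{=} \nu b\, (A_i(b) \mid I(n,b))$, for $i \in \{1,2,3\}$.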
All these systems are distinguished in the ordinary π-calculus: the different
degrees of parallelism that they exhibit are observable. We can prove that they
are equivalent by exploiting the linear receptiveness of the return channels $x, y$.¹
For the equivalence of $S_1(n)$ and $S_2(n)$ one proves that the relation composed of all pairs of the
form

(^Sl{n'>  I  p[v-(nj),S2(n')  [  ]~[w(ra,>) 

i=l  i=l 

for  some  channel  and  integers  is  a  XL-bisimulation  up  to  expansion. 

The  other  equalities  can  be  proved  in  a  similar  way. 

The  above  processes  are  simple.  A  more  interesting  example  of  parallelisation 
of  resources  is  Cliff  Jones’s  parallelisation  transformation  problem  [5].  We  analyse 
this  in  [16],  where  we  prove  Jones’s  transformation  using  a  combination  of  the 
techniques for linear and ω-receptiveness.

Encoding of higher-order process calculi We now present an example with
ω-receptiveness. Below $x, y$ are supposed to be ω-receptive names. We prove the
correctness of an optimisation of the translation of higher-order process calculi
into the π-calculus [13, 18]. In a higher-order calculus, terms of the language
may be transmitted. For simplicity of presentation, we consider the simpler case
of a calculus where only processes may be communicated. The operators are those
for sending a process ($\bar{p}\langle P_1 \rangle.P_2$), receiving a process ($p(X).P$), process variable
($X$), plus the usual operators of restriction, parallel composition, summation and
replication. This calculus, which we call HOPC, is the core of Plain CHOCS
[18], and is a second-order fragment of the Higher-Order π-calculus [13]. The upper-case
letter $X$ ranges over process variables. A process is closed if it does not
contain free variables. The compilation $\mathcal{C}$ of this calculus into the π-calculus in
[13, 18] acts as a homomorphism on all process constructs except input and output
prefixes and process variables, where it is defined as follows:

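$\mathcal{C}[\![\bar{p}\langle P \rangle.Q]\!] \stackrel{\mathrm{def}}{=} \nu x\; \bar{p}\langle x \rangle.(!x.\,\mathcal{C}[\![P]\!] \mid \mathcal{C}[\![Q]\!])$ for $x$ fresh
$\mathcal{C}[\![p(X).Q]\!] \stackrel{\mathrm{def}}{=} p(x).\,\mathcal{C}[\![Q]\!] \qquad \mathcal{C}[\![X]\!] \stackrel{\mathrm{def}}{=} \bar{x}.0$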

In  the  compilation,  the  communication  of  a  process  P  is  translated  as  the  com¬ 
munication  of  a  private  name  which  acts  as  a  pointer  to  (the  translation  of)  P 
and  which  the  recipient  can  use  to  trigger  a  copy  of  (the  translation  of)  P.  These 
pointers, introduced in the compilation, are used as ω-receptive names.

In  [13],  the  correctness  of  compilation  C  is  established,  by  proving  that  it  is 
fully  abstract  w.r.t.  barbed  congruence  (that  is,  for  all  closed  HOPC  processes 
$P$ and $Q$, $P \sim Q$ iff $\mathcal{C}[\![P]\!] \sim \mathcal{C}[\![Q]\!]$). The optimisation that we consider acts
on  outputs  of  process  variables.  Let  us  call  O  the  optimised  compilation.  It  is 
defined  as  C  except  for  the  case  of  an  output  of  a  variable,  for  which  we  have: 

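$\mathcal{O}[\![\bar{p}\langle X \rangle.Q]\!] \stackrel{\mathrm{def}}{=} \bar{p}\langle x \rangle.\,\mathcal{O}[\![Q]\!]$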

¹ In these definitions, name $c$ is also linear receptive. We do not need this fact for the
proofs (and it is reasonable not to use it, because the linear receptiveness of $c$ is
accidental: one can modify the definitions so that $c$ is not linear receptive).
For instance, when translating $p(X).\bar{q}\langle X \rangle.0$, the result of $\mathcal{O}$ is $p(x).\bar{q}\langle x \rangle.0$ while
that of $\mathcal{C}$ is $p(x).\,\nu y\; \bar{q}\langle y \rangle.\,!y.\bar{x}.0$. The optimisation saves one level of
indirection through pointers. This optimisation is analysed in [13] and is shown to be
unsound for untyped barbed equivalence. However, we can show that the
optimisation is sound if we take into account the receptiveness of names. The proof is
an immediate consequence of law (6), since, for all $P$, $\mathcal{O}[\![P]\!]$ can be transformed
into $\mathcal{C}[\![P]\!]$ by repeatedly applying the law:

Theorem  8.  Let  P  be  a  HOPC  process  with  free  variables  in  {Xi, . . . ,  X^},  and 
let  P  {.Ti, .  ..,Xnr}.  It  holds  that  C[PJ  OlP\. 

Combining  this  with  the  theorem  on  C  in  [13],  we  can  prove:  for  all  closed 
HOPC  processes  P  and  Q,  P  ^  Q  iff  C^[PI  — OlQj. 

In an expanded paper [17], other examples of application of ω-receptiveness
are reported: the proof of the equivalence between the target processes of
Milner’s two encodings of call-by-value λ-calculus into π-calculus [8] (this is a novel
result); and the proofs of some stronger versions of π-calculus replication theorems
[10] (these results were already proved in [10]; exploiting receptiveness we get
simpler proofs).

6  Final  remarks 

Several  type  systems  have  been  proposed  for  process  calculi.  The  most  relevant 
for  this  work  are  [10],  where  the  type  system  has  input/output  modalities  to  dis¬ 
tinguish  between  the  capabilities  of  reading  and  writing  on  names,  and  the  type 
systems  expressing  linearity  information  [3,  7,  4].  The  type  system  for  recep¬ 
tiveness  represents  a  refinement  of  [10]  and,  in  the  case  of  linear  receptiveness, 
also  of  [7].  Also  [10]  and  [7]  contain  studies  of  the  effect  of  types  on  process 
behaviours, using barbed congruence. The proof techniques developed in this
paper are easier to apply, mainly because they are based on labeled bisimilarities.

Other papers with results on behavioural consequences of π-calculus types
include the following. [6] defines a type system for the asynchronous π-calculus
that guarantees deadlock freedom in certain cases; a subsystem of this system
is similar to ours for ω-receptiveness. [19] uses a type system where types have
a graph structure to prove the full abstraction of an encoding of the polyadic
π-calculus into the monadic calculus. Graphs allow expressing sophisticated
communication protocols but introduce some complications in the typing. [14] uses
a type system with input/output modalities and variant types to guarantee the
adequacy of a translation of a typed object-oriented calculus into the π-calculus.
[11] studies the constraints imposed by parametric polymorphism.

Some  of  the  ideas  in  this  paper  should  be  useful  to  develop  reasoning  tech¬ 
niques  for  other  type  systems,  in  particular  those  with  input/output  modalities 
and  with  linearity.  They  might  also  be  useful  in  cases  where  either  the  receptive¬ 
ness  or  the  uniformity  condition  fails;  for  instance  the  calculus  in  [2],  where  all 
names  are  uniform  but  not  necessarily  receptive,  or  that  in  [1],  where  all  names 
are  receptive  but  not  necessarily  uniform. 
Abstract. An account of the basic theory of confluence in the π-calculus
is presented, techniques for showing confluence of mobile systems are
given, and the utility of some of the theory presented is illustrated via
an analysis of a distributed algorithm.


1  Introduction 

Confluence  arises  in  a  variety  of  forms  in  computation  theory.  It  was  first  studied 
in  the  context  of  concurrent  systems  by  Milner  in  [6].  Its  essence,  to  quote  [7], 
is  that  “of  any  two  possible  actions,  the  occurrence  of  one  will  never  preclude 
the other”. As shown in the works cited, for pure CCS agents confluence implies
determinacy  and  semantic-invariance  under  silent  actions,  and  is  preserved  by 
several  important  system-building  operations.  These  facts  make  it  possible  to 
guarantee  by  construction  that  certain  systems  are  confluent  and  to  exploit  this 
fact  fruitfully  when  analysing  their  behaviours.  A  more  general  study  was  made 
in  [1]  which  in  particular  clarified  the  relationships  among  various  notions  of  con¬ 
fluence  and  semantic-invariance  under  silent  actions,  and  illustrated  the  utility 
of  the  ideas  for  state-space  reduction  and  protocol  analysis;  see  also  [1]  for  fur¬ 
ther  references.  Confluence  of  value-passing  CCS  agents  was  studied  first  in  [18] 
and  later  in  [22]  where  consideration  was  given  to  conditions  under  which  con¬ 
fluent  systems  result  from  combinations  of  ‘semi-confluent’  agents  and  the  ideas 
were  utilized  to  show  determinacy  of  programs  in  a  fragment  of  a  concurrent 
imperative  programming  language. 

The elaboration of techniques for reasoning about mobile systems expressed
in the π-calculus [9] and variants of it has involved extension of established
methods and development of new concepts specific to the richer setting. Stemming
from [8] there have been several works on disciplines of name-use respected by
agents, sometimes expressed via type systems; see for instance [2, 15, 23, 25,
20, 16]. Such disciplines contribute much to the effectiveness of π-calculi as
descriptive formalisms and analytical tools. One promising strand of development
concerns varieties of confluence. These have been used in showing determinacy
of systems prescribed by concurrent object-oriented programs [13], in
justifying optimizations in the Pict compiler [3, 17], and in proving the soundness of
transformation rules for concurrent object-oriented programs [4, 14]. The aims
of this paper are to give an account of the basic theory of confluence in the
π-calculus, to develop techniques for showing that mobile systems are confluent,
and to illustrate the utility of some of the theory presented via an analysis of a
distributed algorithm. The extension of the theory from pure and simple
value-passing agents to mobile agents is at some places fairly straightforward: we then
proceed quickly, drawing attention only to significant points. Due to the richness
of name-passing, however, techniques for showing mobile systems to be confluent
are more involved. This paper contains a sample of results obtained on this topic
in the first author’s thesis [12]. Independently, Uwe Nestmann in his thesis [11]
has developed a static type system concerned with sharing of ports (polarized
names) by mobile agents and shown that well-typed agents are confluent.

A  summary  of  the  paper  follows.  Preliminary  material  is  collected  in  the  next 
section,  while  in  section  3  the  basic  definitions  and  results  on  confluence  in  the 
π-calculus are given. Section 4 is concerned with techniques for showing that
complex  systems  are  confluent.  The  final  section  is  devoted  to  an  illustration  of 
the  utility  of  some  of  the  theory  presented:  an  analysis  of  a  distributed  algorithm. 
Due  to  lack  of  space  all  proofs  are  omitted;  see  [12]  for  a  detailed  technical 
account. 

We  are  grateful  to  an  anonymous  referee  for  helpful  comments. 


2  Preliminaries 

In this section we briefly recall background material on the (polyadic) π-calculus
[9, 8]. For undefined terms and explanation we refer to these papers.

We assume an infinite set $N$ of names, ranged over by lower-case letters, a
partition $\mathbf{S}$ of $N$ into a set of infinite (subject) sorts, and a sorting $\lambda : \mathbf{S} \to \mathbf{S}^*$.
For $S \in \mathbf{S}$, $\lambda(S)$ is the object sort associated with $S$. The agents are the
expressions given as follows which respect the sorting $\lambda$:

$P ::= 0 \;\mid\; \pi.P \;\mid\; P + Q \;\mid\; P \,|\, Q \;\mid\; (\nu y)P \;\mid\; A\langle \tilde{y} \rangle$.

Here $\pi$ ranges over the prefixes $\tau$, $x(\tilde{y})$ and $\bar{x}\langle \tilde{y} \rangle$, in the latter two of which $x$ is
the subject and the tuple $\tilde{y}$ is the object. In a prefix $x(\tilde{y})$ the occurrences of the
pairwise-distinct names $\tilde{y}$ are binding; the occurrence of $y$ in $(\nu y)$ is also binding.
We write $\mathrm{fn}(P)$ (resp. $\mathrm{bn}(P)$) for the set of free (resp. bound) names of $P$, and
$\mathrm{n}(P)$ for the set of all names occurring in $P$. We write also $\mathrm{fn}_S(P)$ for the free
names of $P$ of sort $S$. Each agent constant $A$ has a defining equation $A(\tilde{x}) \stackrel{\mathrm{def}}{=} P$
where $\mathrm{fn}(P) \subseteq \tilde{x}$ and $\tilde{x}$ are pairwise distinct. We regard as identical agents which
differ only by change of bound names. We write $\equiv$ for structural congruence of
agents. A substitution is a sort-respecting mapping from $N$ to $N$. We write $P\sigma$ for
the agent obtained from $P$ by applying the substitution $\sigma$. We write $\{\tilde{y}/\tilde{x}\}$ for the
substitution which maps each component of $\tilde{x}$ to the corresponding component
of $\tilde{y}$ and is otherwise the identity.

Here we give the behaviour of agents by the early transition rules [10, 19].
In this system there are three kinds of action: input actions of the form $x(\tilde{y})$;
output actions of the form $(\nu \tilde{z})\bar{x}\langle \tilde{y} \rangle$, where the set $\tilde{z}$ of bound names of the
action (which is omitted when it is empty) satisfies $\tilde{z} \subseteq \tilde{y}$; and the silent action
$\tau$ representing communication between agents. We write $\mathrm{bn}(\alpha)$ for the set of
bound names of the output $\alpha$ and set $\mathrm{bn}(\alpha) = \emptyset$ if $\alpha$ is an input or $\tau$. We
write Act for the set of actions. The subject/object terminology carries over
from prefixes to visible actions. The transition rules are as follows, where $\mathrm{n}(\alpha)$
is the set of names occurring in the action $\alpha$. The third, fourth and fifth have
symmetric forms.

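1. $x(\tilde{y}).P \stackrel{x(\tilde{z})}{\longrightarrow} P\{\tilde{z}/\tilde{y}\}$ if the sorts of the components of $\tilde{y}$ and $\tilde{z}$ agree.

2. $\pi.P \stackrel{\pi}{\longrightarrow} P$ if $\pi$ is $\tau$ or $\bar{x}\langle \tilde{y} \rangle$.

3. If $P \stackrel{\alpha}{\longrightarrow} P'$ then $P + Q \stackrel{\alpha}{\longrightarrow} P'$.

4. If $P \stackrel{\alpha}{\longrightarrow} P'$ then $P \mid Q \stackrel{\alpha}{\longrightarrow} P' \mid Q$ if $\mathrm{bn}(\alpha) \cap \mathrm{fn}(Q) = \emptyset$.

5. If $P \stackrel{(\nu \tilde{z})\bar{x}\langle \tilde{y} \rangle}{\longrightarrow} P'$ and $Q \stackrel{x(\tilde{y})}{\longrightarrow} Q'$ then $P \mid Q \stackrel{\tau}{\longrightarrow} (\nu \tilde{z})(P' \mid Q')$ if $\tilde{z} \cap \mathrm{fn}(Q) = \emptyset$.

6. If $P \stackrel{\alpha}{\longrightarrow} P'$ and $y \notin \mathrm{n}(\alpha)$ then $(\nu y)P \stackrel{\alpha}{\longrightarrow} (\nu y)P'$.

7. If $P \stackrel{(\nu \tilde{z})\bar{x}\langle \tilde{y} \rangle}{\longrightarrow} P'$ and $w \in \tilde{y} - (\tilde{z} \cup \{x\})$ then $(\nu w)P \stackrel{(\nu w \tilde{z})\bar{x}\langle \tilde{y} \rangle}{\longrightarrow} P'$.

8. If $P\{\tilde{y}/\tilde{x}\} \stackrel{\alpha}{\longrightarrow} P'$ and $A(\tilde{x}) \stackrel{\mathrm{def}}{=} P$ then $A\langle \tilde{y} \rangle \stackrel{\alpha}{\longrightarrow} P'$.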

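We write $\Longrightarrow$ for the reflexive and transitive closure of $\stackrel{\tau}{\longrightarrow}$, $\stackrel{\alpha}{\Longrightarrow}$ for the
composition $\Longrightarrow \stackrel{\alpha}{\longrightarrow} \Longrightarrow$, and $\stackrel{\hat{\alpha}}{\Longrightarrow}$ for $\Longrightarrow$ if $\alpha = \tau$ and $\stackrel{\alpha}{\Longrightarrow}$ otherwise. We further write
$P \stackrel{\hat{\alpha}}{\longrightarrow} Q$ if $P \stackrel{\alpha}{\longrightarrow} Q$, or $\alpha = \tau$ and $P \equiv Q$.

We often tacitly assume that bound names of actions are fresh. (Early)
bisimilarity is the largest symmetric relation $\approx$ such that if $P \approx Q$ and $P \stackrel{\alpha}{\longrightarrow} P'$, then for
some $Q'$, $Q \stackrel{\hat{\alpha}}{\Longrightarrow} Q'$ and $P' \approx Q'$. Branching bisimilarity is the largest symmetric
relation $\simeq$ such that if $P \simeq Q$ and $P \stackrel{\alpha}{\longrightarrow} P'$, then either $\alpha = \tau$ and $P' \simeq Q$, or
for some $Q', Q''$, $Q \Longrightarrow Q'' \stackrel{\alpha}{\longrightarrow} Q'$, $P \simeq Q''$ and $P' \simeq Q'$. The standard
notations for these relations have a dot to differentiate them from the congruences
defined as bisimilarity under all substitutions. Since we do not consider the latter
here we use the less cumbersome symbols. Finally, an agent $P$ diverges, written
$P\uparrow$, if $P$ can perform an infinite sequence of $\tau$ actions; otherwise $P$ converges, written
$P\downarrow$; and $P$ is fully convergent if for each derivative $P'$ of $P$, $P'\downarrow$.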

3  Confluence 

In  [7]  confluence  for  pure  CCS  agents  was  defined  using  bisimilarity,  and  it  was 
shown  that  a  wide  range  of  behavioural  equivalences  coincide  on  confluent  agents. 
In developing a theory of confluence for the π-calculus we choose here to base
it  on  early  bisimilarity.  The  connections  between  this  treatment  and  the  various 
other  possibilities  are  straightforward.  In  our  view,  in  applications  of  the  theory 
there  is  likely  to  be  little  substantial  difference  between  the  variants.  With  this 
choice  ‘determinacy’  can  be  defined  as  it  can  for  pure  CCS  agents. 

Definition 1. $P$ is determinate if for each derivative $Q$ of $P$ and action $\alpha$, if
$Q \stackrel{\alpha}{\Longrightarrow} Q'$ and $Q \stackrel{\alpha}{\Longrightarrow} Q''$ then $Q' \approx Q''$. □
Note for instance that $P = a(x).(x(y).\bar{a}\langle y \rangle.0 + b(y).0)$ is not determinate if $x, b$
have the same sort, as $P \stackrel{a(b)}{\longrightarrow} Q = b(y).\bar{a}\langle y \rangle.0 + b(y).0$ and $Q$ has non-bisimilar
$b(y)$-derivatives. As in pure CCS, an agent bisimilar to a determinate agent is
determinate, and determinate agents are bisimilar if they may perform the same
sequences of visible actions. The following lemma summarizes conditions under
which determinacy is preserved by operators. In the last part, sort($M$) is the set
of sorts of the names in $M$.

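Lemma 2.

1. If $P$ is determinate so are $\tau.P$, $\bar{x}\langle \tilde{y} \rangle.P$ and $(\nu y)P$.

2. If $P$ is determinate and for each $y \in \tilde{y}$, if $y$ is of sort $S$ then $\mathrm{fn}_S(P) \subseteq \{y\}$,
then $a(\tilde{y}).P$ is determinate.

3. If each $\pi_i.P_i$ is determinate, no $\pi_i$ is $\tau$ and no two of the $\pi_i$ are inputs or
outputs with the same subject, then $\sum_i \pi_i.P_i$ is determinate.

4. If $P_1, P_2$ are determinate, $\mathrm{fn}(P_1) \cap \mathrm{fn}(P_2) = \emptyset$, $\mathrm{sort}(\mathrm{bn}(P_1)) \cap \mathrm{sort}(\mathrm{n}(P_2)) = \emptyset$
and $\mathrm{sort}(\mathrm{bn}(P_2)) \cap \mathrm{sort}(\mathrm{n}(P_1)) = \emptyset$, then $P_1 \mid P_2$ is determinate. □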

The condition in (2) cannot be dropped: consider $R = x(y).\bar{a}\langle y \rangle.0 + b(y).0$
where $x, b$ have the same sort. Clearly $R$ is determinate but $P = a(x).R$ above is
not, as $R\{b/x\}$ is not. Note, however, that if $x, b$ were of different sorts, $P$ would be
determinate. Using sorts to make distinctions among names in this way is often
helpful in applications of the calculus. Similarly, the condition in (4) cannot be
dropped: as in CCS, $P_1, P_2$ cannot share free names (consider $a.0 \mid a.0$), but in
addition in the mobile setting more must be said, as that property need not be
preserved under transition; for instance if $P = w(z).z(x).b.0$, $Q = a(y).c.0$ and
$a, z$ are of the same sort, then $P \mid Q \stackrel{w(a)}{\longrightarrow} R = a(x).b.0 \mid a(y).c.0$ and $R$ is not
determinate. The condition in (4) ensures that a bound name of one component
cannot be instantiated with a name free in the other.

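A pure CCS agent $P$ is confluent if for each derivative $Q$ of it and distinct
$\alpha, \beta$, (i) if $Q \stackrel{\alpha}{\Longrightarrow} Q_1$ and $Q \stackrel{\alpha}{\Longrightarrow} Q_2$, then $Q_1 \Longrightarrow Q_1'$ and $Q_2 \Longrightarrow Q_2' \approx Q_1'$;
and (ii) if $Q \stackrel{\alpha}{\Longrightarrow} Q_1$ and $Q \stackrel{\beta}{\Longrightarrow} Q_2$, then $Q_1 \stackrel{\beta}{\Longrightarrow} Q_1'$ and $Q_2 \stackrel{\alpha}{\Longrightarrow} Q_2' \approx Q_1'$.
For value-passing CCS agents the definition must be refined to account
for different inputs with the same subject [18, 22]. This holds also for mobile
agents with the additional point that data received are names which may be
used for interaction: consider $P = a(x).x(y).0$, which one would expect to be
determinate, and the transitions $P \stackrel{a(b)}{\longrightarrow} b(y).0$ and $P \stackrel{a(c)}{\longrightarrow} c(y).0$. In the
π-calculus a further consideration arises: consider $P = (\nu z)(\bar{a}\langle z \rangle.0 \mid \bar{b}\langle z \rangle.0)$ and
its transitions $P \stackrel{(\nu z)\bar{a}\langle z \rangle}{\longrightarrow} P_1 = \bar{b}\langle z \rangle.0$ and $P \stackrel{(\nu z)\bar{b}\langle z \rangle}{\longrightarrow} P_2 = \bar{a}\langle z \rangle.0$. Note that $P_1$
has no $(\nu z)\bar{b}\langle z \rangle$-transition, and dually for $P_2$. In our view $P$ should none the
less be regarded as confluent. To give the definition we introduce two pieces of
notation.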

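Notation 3 We write $\alpha \bowtie \beta$ if $\alpha$ and $\beta$ are different actions and are not both
inputs with the same subject. The weight $\alpha\lfloor\beta$ of action $\alpha$ over action $\beta$ is $\alpha$,
except if $\alpha = (\nu \tilde{z})\bar{a}\langle \tilde{y} \rangle$ when it is $(\nu \tilde{z} - \mathrm{bn}(\beta))\bar{a}\langle \tilde{y} \rangle$. □

Thus for instance, $(\nu y z)\bar{a}\langle y, z \rangle \lfloor (\nu z)\bar{b}\langle x, z \rangle$ is $(\nu y)\bar{a}\langle y, z \rangle$. We then have: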
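
The two pieces of notation are easy to mechanise. The following Python sketch is illustrative only: the `Action` record and the helper names `compatible` and `weight` are our own choices for this example and are not part of the calculus. An action is encoded by its kind, its subject, its object tuple and its set of bound names, and the last lines reproduce the instance of the weight given above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class Action:
    """Hypothetical encoding of an action: kind is 'in', 'out' or 'tau'."""
    kind: str
    subject: str = ""
    objects: Tuple[str, ...] = ()
    bound: FrozenSet[str] = frozenset()


def bn(a: Action) -> FrozenSet[str]:
    # Only outputs carry bound names; inputs and tau have none.
    return a.bound if a.kind == "out" else frozenset()


def compatible(a: Action, b: Action) -> bool:
    """The relation written with the bowtie symbol: different actions that
    are not both inputs with the same subject."""
    if a == b:
        return False
    both_inputs = a.kind == "in" and b.kind == "in"
    return not (both_inputs and a.subject == b.subject)


def weight(a: Action, b: Action) -> Action:
    """Weight of a over b: a itself, except that an output loses the bound
    names that are also bound in b."""
    if a.kind != "out":
        return a
    return Action("out", a.subject, a.objects, a.bound - bn(b))


# (nu y z) a<y,z> weighed over (nu z) b<x,z> gives (nu y) a<y,z>.
alpha = Action("out", "a", ("y", "z"), frozenset({"y", "z"}))
beta = Action("out", "b", ("x", "z"), frozenset({"z"}))
result = weight(alpha, beta)
```

Keeping the bound names as a set, separate from the object tuple, makes the weight a plain set difference and leaves the objects of the action untouched.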

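Definition 4. An agent $P$ is confluent if for each derivative $Q$ of $P$ and $\alpha, \beta$
with $\alpha \bowtie \beta$, (i) if $Q \stackrel{\alpha}{\Longrightarrow} Q_1$ and $Q \stackrel{\alpha}{\Longrightarrow} Q_2$, then $Q_1 \Longrightarrow Q_1'$ and $Q_2 \Longrightarrow Q_2' \approx Q_1'$;
and (ii) if $Q \stackrel{\alpha}{\Longrightarrow} Q_1$ and $Q \stackrel{\beta}{\Longrightarrow} Q_2$, then $Q_1 \stackrel{\beta\lfloor\alpha}{\Longrightarrow} Q_1'$ and $Q_2 \stackrel{\alpha\lfloor\beta}{\Longrightarrow} Q_2' \approx Q_1'$. □

Thus for instance $P = (\nu z)(\bar{a}\langle z \rangle.0 \mid \bar{b}\langle z \rangle.0)$ above is confluent, as after
$P \stackrel{(\nu z)\bar{a}\langle z \rangle}{\longrightarrow} P_1 = \bar{b}\langle z \rangle.0$ and $P \stackrel{(\nu z)\bar{b}\langle z \rangle}{\longrightarrow} P_2 = \bar{a}\langle z \rangle.0$ we have
$P_1 \stackrel{\bar{b}\langle z \rangle}{\longrightarrow} 0$ and $P_2 \stackrel{\bar{a}\langle z \rangle}{\longrightarrow} 0$.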

It is easy to see that an agent bisimilar to a confluent agent is itself confluent.
An agent $P$ is τ-inert if for each derivative $Q$ of $P$, if $Q \stackrel{\tau}{\longrightarrow} Q'$ then $Q' \approx Q$.
By a generalization of the argument from the CCS case we have:

Lemma 5. If $P$ is confluent then $P$ is τ-inert. □

The following result is a useful characterization of confluence in which only single
transitions need be considered. It holds only for fully convergent agents. In [1]
it was observed that for fully convergent (‘τ-well-founded’) agents, τ-inertness
implies confluence. A similar observation is included here.

Lemma 6. Suppose $P$ is fully convergent. Then $P$ is confluent iff $P$ is τ-inert
and for each derivative $Q$ of $P$ and $\alpha, \beta$ with $\alpha \bowtie \beta$, (i) if $Q \stackrel{\alpha}{\longrightarrow} Q_1$ and
$Q \stackrel{\alpha}{\longrightarrow} Q_2$ then $Q_1 \approx Q_2$, and (ii) if $Q \stackrel{\alpha}{\longrightarrow} Q_1$ and $Q \stackrel{\beta}{\longrightarrow} Q_2$, then $Q_1 \stackrel{\beta\lfloor\alpha}{\Longrightarrow} Q_1'$
and $Q_2 \stackrel{\alpha\lfloor\beta}{\Longrightarrow} Q_2' \approx Q_1'$. □

The proof shows that if $P$ is fully convergent and τ-inert and satisfies (i), then
$P$ is determinate. The assumption that $P$ is fully convergent cannot be dropped:
consider $P = a.b.0 + \tau.(a.0 + \tau.P)$. It is easy to see that $P$ is τ-inert and
that all of its derivatives satisfy (i) and (ii). However, $P$ is not determinate.
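
The counterexample can be checked mechanically on its four-state transition graph. The sketch below is a naive illustration under stated assumptions, not an efficient algorithm: the state names `P`, `R` (for $a.0 + \tau.P$), `B` (for $b.0$) and `Z` (for $0$) and all function names are ours; weak bisimilarity is computed as a greatest fixpoint by deleting pairs that violate the transfer property; and determinacy is tested by comparing all weak successors of each derivative under the same action, with a weak $\tau$ step read as zero or more silent moves.

```python
from itertools import product

TAU = "tau"


def tau_closure(trans, s):
    """States reachable from s by zero or more tau steps."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for lab, v in trans.get(u, ()):
            if lab == TAU and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def weak_succ(trans, s, lab):
    """Weak successors of s: tau* for TAU, tau* lab tau* for a visible label."""
    pre = tau_closure(trans, s)
    if lab == TAU:
        return pre
    mid = {v for u in pre for l, v in trans.get(u, ()) if l == lab}
    return {w for v in mid for w in tau_closure(trans, v)}


def weak_bisimilarity(states, trans):
    """Greatest fixpoint: start from all pairs, delete pairs that fail."""
    rel = set(product(states, repeat=2))
    changed = True
    while changed:
        changed = False
        for p, q in list(rel):
            forth = all(any((p2, q2) in rel for q2 in weak_succ(trans, q, lab))
                        for lab, p2 in trans.get(p, ()))
            back = all(any((p2, q2) in rel for p2 in weak_succ(trans, p, lab))
                       for lab, q2 in trans.get(q, ()))
            if not (forth and back):
                rel.discard((p, q))
                changed = True
    return rel


def is_tau_inert(trans, s, bisim):
    """Every tau step of every derivative stays inside its bisimilarity class."""
    return all((q, q2) in bisim
               for q in derivatives(trans, s)
               for lab, q2 in trans.get(q, ()) if lab == TAU)


def derivatives(trans, s):
    """All states reachable from s by any sequence of transitions."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for _, v in trans.get(u, ()):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def is_determinate(trans, s, bisim, labels):
    """All weak successors of a derivative under one label are bisimilar."""
    for q in derivatives(trans, s):
        for lab in labels:
            succ = weak_succ(trans, q, lab)
            if any((x, y) not in bisim for x in succ for y in succ):
                return False
    return True


# P = a.b.0 + tau.(a.0 + tau.P), with R = a.0 + tau.P, B = b.0, Z = 0.
trans = {
    "P": [("a", "B"), (TAU, "R")],
    "R": [("a", "Z"), (TAU, "P")],
    "B": [("b", "Z")],
    "Z": [],
}
states = ["P", "R", "B", "Z"]
bisim = weak_bisimilarity(states, trans)
print(is_tau_inert(trans, "P", bisim))                     # True
print(is_determinate(trans, "P", bisim, ["a", "b", TAU]))  # False
```

The silent loop between `P` and `R` is what makes the agent divergent: both states can weakly perform $a$ to reach either `B` or `Z`, so they are bisimilar to each other, while `B` and `Z` are not.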

We  record  the  analogues  for  confluence  of  the  earlier  results  on  preservation 
of  determinacy  by  operators. 

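Lemma 7.

1. If $P$ is confluent so are $\tau.P$, $\bar{x}\langle \tilde{y} \rangle.P$ and $(\nu y)P$.

2. If $P$ is confluent and for each $y \in \tilde{y}$, if $y$ is of sort $S$ then $\mathrm{fn}_S(P) \subseteq \{y\}$, then
$a(\tilde{y}).P$ is confluent.

3. If $P_1, P_2$ are confluent, $\mathrm{fn}(P_1) \cap \mathrm{fn}(P_2) = \emptyset$, $\mathrm{sort}(\mathrm{bn}(P_1)) \cap \mathrm{sort}(\mathrm{n}(P_2)) = \emptyset$
and $\mathrm{sort}(\mathrm{bn}(P_2)) \cap \mathrm{sort}(\mathrm{n}(P_1)) = \emptyset$, then $P_1 \mid P_2$ is confluent. □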

Of  course  here  the  guarded  summation  clause  is  missing. 

In  the  following  section  we  will  consider  further  techniques  for  showing  sys¬ 
tems  to  be  confluent.  Before  doing  so  we  consider  a  variant  of  confluence  based 
on  branching  bisimilarity. 
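Definition 8. $P$ is $\simeq$-confluent if for each derivative $Q$ of $P$ and $\alpha, \beta$ with
$\alpha \bowtie \beta$, (i) if $Q \stackrel{\alpha}{\Longrightarrow} Q_1$ and $Q \stackrel{\alpha}{\Longrightarrow} Q_2$, then $Q_1 \Longrightarrow Q_1'$ and $Q_2 \Longrightarrow Q_2' \simeq Q_1'$;
and (ii) if $Q \stackrel{\alpha}{\Longrightarrow} Q_1$ and $Q \stackrel{\beta}{\Longrightarrow} Q_2$, then $Q_1 \stackrel{\beta\lfloor\alpha}{\Longrightarrow} Q_1'$ and $Q_2 \stackrel{\alpha\lfloor\beta}{\Longrightarrow} Q_2' \simeq Q_1'$. □

The following observations were made in [4]. Confluence (for non-mobile labelled
transition systems) based on branching bisimilarity was also considered in [1] and
observations similar to some of these made. An agent $P$ is $\tau_{\simeq}$-inert if for each
derivative $Q$ of $P$, if $Q \stackrel{\tau}{\longrightarrow} Q'$ then $Q' \simeq Q$.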

Lemma 9.

1. If $P$ is $\simeq$-confluent then $P$ is $\tau_{\simeq}$-inert.

2. If $P, Q$ are τ-inert and $P \approx Q$ then $P \simeq Q$.

3. $P$ is $\tau_{\simeq}$-inert iff $P$ is τ-inert.

4. $P$ is confluent iff $P$ is $\simeq$-confluent. □

In contrast to these coincidences, to obtain a satisfactory notion of ‘partial’
confluence which is not τ-inert it is essential to base the theory on branching
bisimilarity rather than bisimilarity; see [4].

4  Confluence  by  construction 

A  main  motivation  in  [7]  for  studying  confluence  was  to  find  an  interesting  prop¬ 
erty  implying  determinacy  which  can  be  guaranteed  to  hold  simply  by  confining 
the  use  of  combinators  in  building  systems.  Work  elaborating  this  view  and 
showing  its  fruitfulness  has  been  described  in  the  Introduction.  Here  the  empha¬ 
sis  is  on  sample  results  of  this  kind  in  the  richer  setting  of  name-passing.  The 
approach  is  complementary  to  development  of  static  type  systems  as  in  [11,  20]. 
A useful definition: an agent $P$ is o-determinate if for each derivative $Q$ of $P$,
there are not two distinct output actions $\alpha, \beta$ with the same subject such that
$Q \stackrel{\alpha}{\longrightarrow}$ and $Q \stackrel{\beta}{\longrightarrow}$. The first result gives conditions under which a combination
of confluent agents is confluent.

Theorem 10. Suppose $P \equiv (\nu \tilde{z})(P_1 \mid \ldots \mid P_n)$ where each $P_i$ is confluent and
o-determinate. Suppose that for each derivative $P' = (\nu \tilde{z}')(P_1' \mid \ldots \mid P_n')$ of $P$,
no name occurs free in more than two components of $P'$ and a free name of $P'$
occurs in exactly one component of $P'$. Then $P$ is confluent. □
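
The name-sharing hypothesis of the theorem is a simple counting condition on a single state of the system. As a hedged illustration, the Python sketch below checks it for one such state; the function name `sharing_ok` and the representation of a state by the free-name sets of its components together with the set of restricted names are assumptions made for this example. The theorem requires the condition for every derivative, which this check alone does not establish.

```python
from collections import Counter
from typing import Iterable, Set


def sharing_ok(components: Iterable[Set[str]], restricted: Set[str]) -> bool:
    """Check the sharing condition for one state (nu restricted)(P1 | ... | Pn).

    components: the set of free names of each component Pi.
    restricted: the names bound by the outermost restriction.
    """
    # Count in how many components each name occurs free.
    counts = Counter(name for fn in components for name in set(fn))
    for name, k in counts.items():
        if k > 2:
            return False  # a name is shared by more than two components
        if name not in restricted and k != 1:
            return False  # a free name of the whole state must be unshared
    return True


# A chain a - x - y - b: the private names x and y each link two neighbours.
chain = [{"a", "x"}, {"x", "y"}, {"y", "b"}]
chain_ok = sharing_ok(chain, {"x", "y"})
```

In the chain example the private names `x` and `y` each connect exactly two neighbouring components, while the external names `a` and `b` each belong to a single component.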

Note that in this theorem it is not possible to replace ‘confluent’ by ‘determinate’:
consider $(\nu a)(a.0 \mid (\bar{a}.0 + b.0))$.

It  is  often  the  case  that  although  the  components  of  a  system  are  not  them¬ 
selves  confluent,  the  constraints  they  place  upon  one  another’s  behaviour  ensure 
that  the  system  itself  is  confluent.  The  second  theorem  is  an  instance  of  this 
idea.  To  state  it  we  need  some  definitions.  We  refer  to  a  set  of  agents  closed 
under derivation as a system. For $S \in \mathbf{S}$ we say a system is $S$-closed if none of
its agents may perform an input or an output via an $S$-name.
Definition 11. Suppose $S$ and $\tilde{S} = S_1 \ldots S_n$ are distinct sorts and the sorting
$\lambda$ is such that $\lambda(S) = (\tilde{S})$ and no $S_i$ occurs in any other $\lambda(S')$. A system $\mathcal{P}$ is
$S, \tilde{S}$-sensitive if there is a partition $\{\mathcal{P}^{\tilde{p}} \mid \tilde{p}$ a finite subset of $S_1 \times \ldots \times S_n\}$ of
$\mathcal{P}$ such that:

1.  'd  P  e  and  P  P'  where  a  is  not  an  input  or  output  via  an  S-name 
or  an  input  via  an  5t-name,  then  P'  € 

2.  if  P  G  and  P  P'  where  a  is  an  output  via  an  5-name,  then  a  — 

3.  if  P  G  P^  and  P  P'  where  a  =  x{zi,.  ..,Zn)  with  x  :  5,  then  at  most 
one  of  the  Zi  occurs  free  in  P'; 

4.  if  P  G  P^  and  P  P'  where  a  is  an  input  via  an  5i-name,  then  there  is 
z  —  (zi , . . . ,  Zn)  G  p  such  that  the  subject  of  a  is  Zi  and  P'  G 

Further,  P  is  5,  S- confluent  if  it  is  5,  5-sensitive  and  whenever  P  G  P^,  P  Pi 
and  P  P2,  then  unless  for  some  (zi,. ..  ,Zn)  E  p,  a  and  j3  are  inputs  via 

distinct  zi  and  Pi  =§■  P^'  and  P2  P2  ~  P/.  D 

We  then  have: 

Theorem 12. Suppose $P = (\nu \tilde{z})(P_1 \mid \ldots \mid P_n)$ and $\mathcal{P} = \{Q \mid Q$ is a derivative
of a $P_i\}$ is $S$-closed and $S, \tilde{S}$-confluent with partition $\{\mathcal{P}^{\tilde{p}}\}_{\tilde{p}}$. Suppose each $P_i \in \mathcal{P}^{\emptyset}$
and is o-determinate. Suppose that for each derivative $P' = (\nu \tilde{z}')(P_1' \mid \ldots \mid P_n')$
of $P$, no name occurs free in more than two components of $P'$, and a free
name of $P'$ occurs in exactly one component of $P'$. Then $P$ is confluent. □

In  closing  this  section  we  mention  that  related  results  of  a  synthetic  nature  can 
also  be  obtained  for  useful  varieties  of  ‘partial  confluence’  as  described  in  the 
Introduction,  and  that  static  type  systems  as  in  for  instance  the  papers  cited 
earlier  complement  them  effectively. 

5  An  application 

The  aim  of  this  section  is  to  illustrate  the  utility  of  some  of  the  theory  presented 
via  an  analysis  of  a  distributed  algorithm.  It  is  a  variant  of  the  Propagation  of 
Information  with  Feedback  protocol  of  [21]  studied  in  [24].  Consider  a  network 
of  m  processes  connected  by  communication  links,  where  the  graph  having  the 
processes  as  nodes  and  the  links  as  edges  is  connected.  Each  process  stores  an 
integer,  its  value.  A  distinguished  process,  the  root,  conducts  the  interaction 
between the network and its environment. The intended behaviour of the algorithm
is that on receiving a request from the environment, the root should emit
to it the value of the network, i.e. the sum of the values of the m processes. We
proceed  to  give  and  explain  the  process-calculus  description  of  the  algorithm. 
We use the following sorts: E, T, D, I, O. The sorting $\lambda$ is as follows: $\lambda(E) = (T, D)$,
$\lambda(T) = (int)$, $\lambda(D) = ()$, $\lambda(I) = (O)$, $\lambda(O) = (int)$. Here int is the type
of integers; we allow simple arithmetic expressions in the descriptions; the
foregoing theory extends easily to accommodate this. It is intended that each
process passes from its initial quiescent state through some active states to a
final inactive state. The behaviour of a non-root process is described as follows,
where $Q$ represents the quiescent state, $A$ the active states, $I$ the inactive state,
and $\varepsilon$ is the empty tuple.


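$$
\begin{aligned}
Q(\tilde{e}, v) &\stackrel{\mathrm{def}}{=} \sum_{e \in \tilde{e}} e(t,d).\, A(t, \tilde{e}-e, \tilde{e}-e, \varepsilon, \varepsilon, v) \\
A(t, \varepsilon, \varepsilon, \varepsilon, \varepsilon, v) &\stackrel{\mathrm{def}}{=} \overline{t}\langle v\rangle.\, I \\
I &\stackrel{\mathrm{def}}{=} 0 \\
A(t, \tilde{s}, \tilde{r}, \tilde{d}, \tilde{p}, v) &\stackrel{\mathrm{def}}{=} \sum_{e \in \tilde{s}} (\nu t' d)\, \overline{e}\langle t', d\rangle.\, A(t, \tilde{s}-e, \tilde{r}, \tilde{d}, \tilde{p}(t',d), v) \\
&\quad + \sum_{e \in \tilde{r}} e(t', d).\, A(t, \tilde{s}, \tilde{r}-e, \tilde{d}d, \tilde{p}, v) \\
&\quad + \sum_{d \in \tilde{d}} \overline{d}.\, A(t, \tilde{s}, \tilde{r}, \tilde{d}-d, \tilde{p}, v) \\
&\quad + \sum_{(t',d) \in \tilde{p}} t'(v').\, A(t, \tilde{s}, \tilde{r}, \tilde{d}, \tilde{p}-(t',d), v+v') \\
&\quad + \sum_{(t',d) \in \tilde{p}} d.\, A(t, \tilde{s}, \tilde{r}, \tilde{d}, \tilde{p}-(t',d), v).
\end{aligned}
$$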


In $Q(\tilde{e}, v)$, $v$ is the value of the process and the names $\tilde{e}$ of sort E represent the
edges incident on it in the network. In the quiescent state the agent may receive
via any such name a pair of names, $t$ of sort T and $d$ of sort D. It discards $d$
and undertakes to send an integer along $t$, which it does when it has all but
completed its activity (second and third clauses). That activity is described in
the fourth clause: $A(t, \tilde{s}, \tilde{r}, \tilde{d}, \tilde{p}, v)$ represents the state in which the process is
storing $v$, has yet to send data along each E-name in $\tilde{s}$, has yet to receive data
along each E-name in $\tilde{r}$, has yet to send a signal along each D-name in $\tilde{d}$, and
for each T-name, D-name pair in $\tilde{p}$, has yet to receive either an integer along the
T-name or a signal along the D-name.

The  behaviour  of  the  root  is  given  as  follows: 

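$$
\begin{aligned}
Q_0(in, \tilde{e}, v) &\stackrel{\mathrm{def}}{=} in(out).\, A_0(out, \tilde{e}, \tilde{e}, \varepsilon, \varepsilon, v) \\
A_0(out, \varepsilon, \varepsilon, \varepsilon, \varepsilon, v) &\stackrel{\mathrm{def}}{=} \overline{out}\langle v\rangle.\, I_0 \\
I_0 &\stackrel{\mathrm{def}}{=} 0 \\
A_0(out, \tilde{s}, \tilde{r}, \tilde{d}, \tilde{p}, v) &\stackrel{\mathrm{def}}{=} \ldots
\end{aligned}
$$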


where the fourth clause is as for $A$ but with ‘$A_0$’ in place of ‘$A$’ and ‘out’ in
place of ‘$t$’. Thus the root behaves similarly to the other nodes except that it is
activated by receiving along the name in of sort I a name of sort O via which it
undertakes to send the network’s value. The network is represented by

$$P_0 = (\nu\tilde{e})\bigl(Q_0(in, \tilde{e}_0, v_0) \mid \Pi_{1 \le i < m}\, Q(\tilde{e}_i, v_i)\bigr)$$

where $\tilde{e}$ are the E-names representing all the edges and, for each $i$, $\tilde{e}_i$ are those
incident on the $i$-th process. We will prove the following correctness result:
Theorem 13. $P_0 \approx in(out).\, \overline{out}\langle v\rangle.\, 0$, where $v = \sum_i v_i$.

The  algorithm  may  be  thought  of  as  consisting  of  two  phases.  In  the  first  a 
spanning  tree  for  the  network  is  established,  and  in  the  second  each  non-root 
process  passes  to  its  parent  the  sum  of  the  values  stored  in  its  descendants,  and 
the root then emits to the environment the network’s value. The sending by $A_0$
or $A$ along a name $e$ of a pair $t', d$ of fresh names is an invitation to the receiver
either  to  become  a  child  of  the  sender  and  to  undertake  to  send  it  an  integer 
along  t',  or,  if  the  receiver  is  already  active  (and  so  has  a  parent),  to  decline  to 
do  so  by  sending  a  signal  via  d.  A  process  sends  an  integer  to  its  parent  only 
when  it  has  determined  the  sum  of  the  values  of  its  descendants. 
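
The two phases just described can be illustrated by a small message-passing simulation. This is only a sketch under stated assumptions, not the process-calculus description itself: nodes are identified by integers, the fresh names t' and d are modelled by the message kinds "value" and "decline", and a seeded random scheduler (hypothetical, chosen for illustration) picks which in-flight message is delivered next. The function name pif_sum and its parameters are likewise illustrative.

```python
import random

def pif_sum(edges, values, root=0, seed=0):
    """Simulate the two-phase protocol on a connected undirected graph.

    edges: iterable of (u, v) pairs; values: dict mapping node -> int.
    Returns the value the root emits (the sum of all stored values).
    """
    rng = random.Random(seed)
    nbrs = {n: set() for n in values}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    parent = {root: None}   # active nodes and their parents (spanning tree)
    pending = {}            # node -> neighbours that still owe a reply
    total = dict(values)    # running sum held at each node
    done = set()            # inactive nodes
    inbox = []              # in-flight messages: (kind, src, dst, payload)

    def activate(n):
        # Invite every neighbour except the parent to become a child.
        pending[n] = set(nbrs[n]) - {parent[n]}
        for m in pending[n]:
            inbox.append(("invite", n, m, None))

    activate(root)
    result = None
    while True:
        # A node that has heard from every invited neighbour reports upward.
        for n in list(pending):
            if not pending[n] and n not in done:
                done.add(n)
                if parent[n] is None:
                    result = total[n]
                else:
                    inbox.append(("value", n, parent[n], total[n]))
        if not inbox:
            break
        kind, src, dst, payload = inbox.pop(rng.randrange(len(inbox)))
        if kind == "invite":
            if dst not in parent:      # quiescent: accept and become a child
                parent[dst] = src
                activate(dst)
            else:                      # already has a parent: decline
                inbox.append(("decline", dst, src, None))
        elif kind == "value":          # a child reports the sum of its subtree
            total[dst] += payload
            pending[dst].discard(src)
        else:                          # "decline"
            pending[dst].discard(src)
    return result
```

Whatever order the scheduler chooses, the emitted value is the same; that independence of the result from the interleaving is the informal content of the confluence argument that follows.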

First we give a characterization of derivatives of $P_0$. For a sort $S$, in an agent
of the form $(\nu\tilde{z})\Pi_i Z_i$ we say there is an $S$-path between components $Z'$ and $Z''$
if there are $S$-names $x_1, \ldots, x_p$ such that $x_i \in fn(Z_i) \cap fn(Z_{i+1})$ for each $i$, $Z' = Z_1$
and $Z'' = Z_{p+1}$.

Lemma 14. If $P_0 \xrightarrow{w} P$ where $w \in Act^*$ then $P = (\nu\tilde{e}\tilde{t}\tilde{d})(R \mid \Pi_{1 \le i < m} W_i)$
where: (a) $fn(P)$ is $\{in\}$, $\{out\}$ or $\emptyset$, and in and out may occur only in $R$ (the
derivative of the root $Q_0$); (b) no name occurs free in more than two components
of $P$; (c) if a T-name occurs free in a component of $P$, there is a unique T-path
between that component and $R$; (d) the sum of the integers stored in the
components which are quiescent or active is the network’s value. □

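Some useful notation: $P_1 = (\nu\tilde{e})(A_0(out, \tilde{e}_0, \tilde{e}_0, \varepsilon, \varepsilon, v_0) \mid \Pi_{1 \le i < m} Q(\tilde{e}_i, v_i))$,
$P_\psi = (\nu\tilde{e})(A_0(out, \varepsilon, \varepsilon, \varepsilon, \varepsilon, v) \mid \Pi_{1 \le i < m} I)$ and $P_\omega = (\nu\tilde{e})(I_0 \mid \Pi_{1 \le i < m} I)$. We will
later show that $P_\psi$ and $P_\omega$ are derivatives of $P_0$. We use $P$ to range over derivatives
of $P_0$. Key in proving the theorem will be the agents of the form

$$Q'(e, \tilde{e}, v) \stackrel{\mathrm{def}}{=} e(t,d).\, A(t, \tilde{e}-e, \tilde{e}-e, \varepsilon, \varepsilon, v)$$

where $e \in \tilde{e}$. $Q'$ is similar to $Q$ except that it may be activated only by an
interaction along the specific name $e$. Note that, where $\sim$ is strong bisimilarity,

$$Q(\tilde{e}, v) \sim \sum_{e \in \tilde{e}} Q'(e, \tilde{e}, v). \qquad (1)$$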

Let $\mathcal{T}$ be the set of agents of the form

$$T_0 = (\nu\tilde{e})\bigl(Q_0(in, \tilde{e}_0, v_0) \mid \Pi_{1 \le i < m}\, Q'(e_i, \tilde{e}_i, v_i)\bigr)$$

where $e_1, \ldots, e_{m-1}$ represent a spanning tree of the graph, with $e_i \in \tilde{e}_i$ for
each $i$. Note that such a $T_0$ differs from $P_0$ just in having $Q'$ where $P_0$ has $Q$:
the edge via which each non-root node will receive its first communication is
determined; intuitively, $T_0$ represents the fragment of the behaviour of $P_0$ in
which the spanning tree is given by those edges. Let $T_0 \in \mathcal{T}$. Directly from (1)
and Lemma 14 we have:

Corollary 15. If $T_0 \xrightarrow{w} T$ where $w \in Act^*$ then $T = (\nu\tilde{e}\tilde{t}\tilde{d})(R \mid \Pi_{1 \le i < m} W_i)$
where (a)-(d) as in Lemma 14 (with ‘$T$’ for ‘$P$’) hold. □
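Some useful notation: $T_1 = (\nu\tilde{e})(A_0(out, \tilde{e}_0, \tilde{e}_0, \varepsilon, \varepsilon, v_0) \mid \Pi_{1 \le i < m} Q'(e_i, \tilde{e}_i, v_i))$,
$T_\psi = (\nu\tilde{e})(A_0(out, \varepsilon, \varepsilon, \varepsilon, \varepsilon, v) \mid \Pi_{1 \le i < m} I)$ and $T_\omega = (\nu\tilde{e})(I_0 \mid \Pi_{1 \le i < m} I)$. We
use $T$ to range over derivatives of $T_0$. We analyse $T_0$, noting first that it has a
specific behaviour:

Lemma 16. $T_0 \xrightarrow{in(out)} T_1 \Longrightarrow T_\psi \xrightarrow{\overline{out}\langle v\rangle} T_\omega$. □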


We now have the key observation, whose proof appeals to Theorem 12.

Lemma 17. $T_0$ is confluent.

From these two results we have:

Corollary 18. $T_0 \approx in(out).\, \overline{out}\langle v\rangle.\, 0$. □

Having used confluence to analyse the behaviour of $T_0$ we now relate it to that
of $P_0$. We say $P$ and $T$ are similar if they differ only in that where $P$ has a
quiescent component $Q$, $T$ has a quiescent component $Q'$.

Lemma 19. $\{(T, P) \mid P$ and $T$ are similar$\}$ is a strong simulation. □

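By Lemmas 16 and 19 we have that $P_0 \xrightarrow{in(out)} P_1 \Longrightarrow P_\psi \xrightarrow{\overline{out}\langle v\rangle} P_\omega$. We say that $T_0$
is compatible with a computation $P_0 \xrightarrow{\alpha_1} P_1 \cdots \xrightarrow{\alpha_r} P_r$ if for each $i$, if $\alpha_i$ is $\tau$
and arises from complementary actions $(\nu t' d)\overline{e}\langle t', d\rangle$, $e(t', d)$ where the second
is performed by a quiescent component $Q(\tilde{e}_j, v_j)$, then in $T_0$ that component is
$Q'(e, \tilde{e}_j, v_j)$; i.e. the E-names used to activate components in the computation
are those via which the $Q'$-components of $T_0$ may be activated.

Lemma 20. If $P_0 \Longrightarrow P$ then for any $T_0$ compatible with the computation,
$T_0 \Longrightarrow T$ with $P$ and $T$ similar. □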

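We can now prove the theorem. Since $P_0 \sim in(out).P_1$ it suffices to show that
$P_1 \approx \overline{out}\langle v\rangle.\, 0$. We have seen that $P_1 \Longrightarrow P_\psi \xrightarrow{\overline{out}\langle v\rangle} P_\omega \sim 0$. Choose one such computation
and, by Lemma 14, choose $T_0$ compatible with it. Then not ($P_1 \Longrightarrow \xrightarrow{\alpha}$ with $\alpha \neq \overline{out}\langle v\rangle$)
as otherwise by Lemma 20, ($T_1 \Longrightarrow \xrightarrow{\alpha}$), contradicting Lemma 18. Finally,
and for the same reason, not ($P_1 \Longrightarrow P_1' \not\to$). □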

We conclude by briefly comparing this analysis with that in [24]. The latter
uses a static I/O-automaton model [5] of the algorithm and establishes that the
fair traces of the automaton representing it are included in those of an automaton
akin to the agent $in(out).\, \overline{out}\langle v\rangle.\, 0$. In our view name-passing and careful
use of sorts allow a very direct and perspicuous description of the algorithm’s
behaviour: the construction and use of the spanning tree are manifest in the
description. Moreover the use of reasoning techniques involving name-passing aids
the analysis, and the proof illustrates the idea that when studying the behaviour
of a confluent system it may suffice to examine in detail only a (small) part of
it. Finally, here the correctness criterion is bisimilarity, rather than inclusion of
fair traces.
Abstract. The paper investigates a concurrent computation model, chi
calculus, in which communications resemble cut eliminations for classical
proofs. The algebraic properties of the model are studied. Its relationship
to sequential computation is illustrated by showing that it incorporates
the operational semantics of the call-by-name lambda calculus. Practically
the model has pi calculus as a submodel.


1  Communication  as  Cut  Elimination 

Concurrent computation is currently an open-ended issue. The situation is in
contrast with sequential computation whose operational semantics is formalized
by, among others, the λ-calculus ([2]). In retrospect, the λ-calculus can be seen
as a fallout of proof theory. Curry-Howard’s proposition-as-type principle allows
one to code up constructive proofs as typed terms. At the core of the constructive
logic is the minimal logic, whose type theoretical formulation gives rise to,
roughly, the simply typed λ-calculus. Now the untyped λ-calculus is obtained
from the simply typed λ-calculus by removing all the typing information.

In recent years, classical proofs have been investigated in a computational setting.
Girard proposed proof nets ([4]) as term representations of classical linear
proofs. These classical terms are typed. The conclusion of a proof derivation is the
type of the proof net corresponding to that proof derivation. The computations
of these terms are cut eliminations modeled by rewritings of graphs. As the terms
are typed, cuts happen between nodes of correlated types. Abramsky’s
proof-as-process interpretation ([1, 3]) relates proof nets to processes. At the operational
level, this interpretation is supported by a cut-elimination-as-communication
paradigm. It looks like a type-erasing interpretation similar to the one found in
a constructive world.

This paper investigates a concurrent computation model obtained by reversing
the roles of proofs and processes in Abramsky’s paradigm. That is to say
that we regard communications as cut eliminations. The way to arrive at such a
model of communication echoes that in the sequential world. First we take the
multiplicative linear logic as the ‘minimal logic’ in a classical framework. There
is nothing canonical about this choice. As the typed classical terms we take the
proof nets. The following left diagram is a proof net:

* Supported by NNSF of China, grant number 69503006.


The first step towards the model is to abstract away the logical aspect of proof
nets but keep its proof theoretical content. The above proof net becomes the
right diagram in the above. There are two kinds of edge in the net. So the second
step is to transform the net into a graph with only directed arrows:


We then forget about the typing information while recording positive and negative
information by labels on arrows, arriving at an untyped graph (left below).


This is the untyped version of the original classical typed term. Notice that there
are two kinds of node in the proof net: the internal nodes and the conclusion
nodes. In order to distinguish them in the untyped graph, we label the conclusion
nodes with small letters (above right). We call graphs of this kind reaction
graphs. In a reaction graph, a node without (with) a label is called local (global).
Reaction graphs can be seen as the underlying graphs of proof derivations in a
generalized and distilled form. Computations with reaction graphs are cut
eliminations. Here is an example of two consecutive cut-eliminations:


In the left graph, the two upper nodes show up opposite polarities to the left
bottom node. This cut is eliminated in the first reduction. The two arrows are
removed and the two upper nodes are coerced with the resulting node labeled by
m. In the middle graph, the two bottom nodes with the arrows pointing to the
node labeled m form a cut. The second reduction eliminates the cut. The idea of
this paper is to think of these cut-eliminations as communications. To develop
the idea, we need a process-like notation for reaction graphs. Let us define graph
terms by abstract syntax as follows: $G := 0 \mid m[x] \mid \overline{m}[x] \mid (x)G \mid G|G'$. Here 0 is
the empty reaction graph; $m[x]$ and $\overline{m}[x]$ are respectively the following graphs:

[Figure: the reaction graphs of $m[x]$ and $\overline{m}[x]$.]

$(x)G$ is obtained from $G$ by removing the label $x$ from $G$; $G|G'$ is the
amalgamation of $G$ and $G'$, coercing nodes with same labels. The two consecutive
cut-eliminations  in  the  above  can  now  be  described  by  the  following  reductions: 

This  term  representation  gives  rise  to  a  calculus  of  reaction  graphs. 

The calculus of graphs only deals with finite computations. To achieve Turing
computability, we extend the language with standard process combinators. The
resulting language will be referred to as χ-calculus, where χ stands for exchange
of information. The paper initiates a study of this computation model.

2  A  Model  for  Concurrent  Computation 

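Let $\mathcal{N}$ be a set of names ranged over by lower case letters and $\overline{\mathcal{N}} = \{\overline{a} \mid a \in \mathcal{N}\}$
be the set of conames. The union $\mathcal{N} \cup \overline{\mathcal{N}}$ will be ranged over by $\alpha$. Define $\overline{\alpha}$ to
be $\overline{m}$ ($m$) whenever $\alpha$ is $m$ ($\overline{m}$). Let $T$ be the set of χ-terms defined as follows:

$$P := 0 \mid \alpha[x].P \mid P|P' \mid (x)P \mid \alpha(x){*}P.$$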

Here $m[x].P$ and $\overline{m}[x].P$ are terms that must first perform a communication
through name $m$ and then enact $P[y/x]$, where $y$ is the name received in the
communication. In $(x)P$, the $(x)$-part is a localization combinator. In both $(x)P$
and $\alpha(x){*}P$, $x$ is local. The set of local names appearing in $P$ is denoted by
$ln(P)$, whereas the set of global names, or non-local names, in $P$ is designated
by $gn(P)$. Set $n(P)$ is the union of $ln(P)$ and $gn(P)$. We adopt the α-convention
saying that a local name in a term can be replaced by a fresh name without
changing its syntax.

The effect of a substitution $[y_1/x_1] \ldots [y_n/x_n]$ on a term is defined as follows:
$P[y_1/x_1] \ldots [y_n/x_n] \stackrel{\mathrm{def}}{=} (\ldots(P[y_1/x_1]) \ldots)[y_n/x_n]$. Substitutions will be ranged
over by $\sigma$.
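
The left-to-right reading of a substitution sequence can be made concrete with a small sketch. The term representation below (nested tuples tagged "nil", "pre", "par", "loc", "rep") and the function names are assumptions introduced only for illustration; the sketch also assumes, following the α-convention, that local names have already been renamed apart from the names being substituted, and it treats the name bound by a replicated prefix as local.

```python
def subst_one(term, y, x):
    """Return term[y/x]: replace the free occurrences of name x by y."""
    def ren(n):
        return y if n == x else n

    tag = term[0]
    if tag == "nil":                       # 0
        return term
    if tag == "par":                       # P | Q
        return ("par", subst_one(term[1], y, x), subst_one(term[2], y, x))
    if tag == "loc":                       # (z)P, z is local in P
        z, body = term[1], term[2]
        if z == x:
            return term                    # x is not free under this binder
        assert z != y, "alpha-convert first to avoid capture"
        return ("loc", z, subst_one(body, y, x))
    if tag == "pre":                       # a[z].P or its coname form; no binder
        _, subj, is_co, obj, body = term
        return ("pre", ren(subj), is_co, ren(obj), subst_one(body, y, x))
    if tag == "rep":                       # a(z)*P, z is local in P
        _, subj, is_co, z, body = term
        if z == x:
            return ("rep", ren(subj), is_co, z, body)
        assert z != y, "alpha-convert first to avoid capture"
        return ("rep", ren(subj), is_co, z, subst_one(body, y, x))
    raise ValueError(f"unknown term tag: {tag}")

def subst(term, pairs):
    """Apply [y1/x1]...[yn/xn] one after another, leftmost first."""
    for y, x in pairs:
        term = subst_one(term, y, x)
    return term
```

Because the substitutions are applied in sequence rather than simultaneously, applying [y/x][z/y] to a term turns every free x into z, not into y.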

For simplicity, a structural congruence is imposed on the members of $T$.

Definition 1. The relation = is the least congruence on χ-terms that contains:

(i) $P|0 = P$, $P_1|P_2 = P_2|P_1$, and $P_1|(P_2|P_3) = (P_1|P_2)|P_3$;

(ii) $(x)0 = 0$, $(x)(y)P = (y)(x)P$, and $(x)(P|Q) = P|(x)Q$ if $x \notin gn(P)$;

(iii) $P = Q$ if $P$ and $Q$ are α-convertible.

We regard = as a grammatic equality. So $P = Q$ means that $P$ and $Q$ are
syntactically the same. The operational semantics of the language can be defined in
terms of a labeled transition system. We prefer however a reductional semantics
for χ-calculus in the style of [5]:

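$$(x)(R \mid \alpha[x].P \mid \overline{\alpha}[y].Q) \to (x)(R[y/x] \mid P[y/x] \mid Q[y/x])$$

$$\alpha(x){*}P \mid \overline{\alpha}[y].Q \to \alpha(x){*}P \mid P[y/x] \mid Q$$

$$\frac{P \to P'}{P|Q \to P'|Q} \qquad \frac{P \to P'}{(x)P \to (x)P'}$$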
To help understand the communication rules, we now give some examples,
assuming $x$ and $y$ are distinct:

$$
\begin{aligned}
(x)(R \mid m[y].P \mid \overline{m}[x].Q) &\to R[y/x] \mid P[y/x] \mid Q[y/x] \\
m[y].P \mid (x)(R \mid \overline{m}[x].Q) &\to P \mid R[y/x] \mid Q[y/x] \\
(y)(m[y].P \mid (x)(R \mid \overline{m}[x].Q)) &\to (y)(P \mid R[y/x] \mid Q[y/x]) \\
(x)m[x].P \mid (y)\overline{m}[y].Q &\to (z)(P[z/x] \mid Q[z/y]), \text{ where } z \text{ is fresh} \\
(x)(m[x].P \mid \overline{m}[x].Q) &\to (x)(P \mid Q).
\end{aligned}
$$

It is clear from these examples that the localization operator in χ-calculus acts
as an effect delimiter. A communication either instantiates a local name by a
global name or identifies two local names.
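
As a small worked instance of the first kind of communication (a sketch only: the particular terms, and the choice of which prefix carries the overline, are assumptions made for illustration), take a local name $x$, a global name $y$, and a third component $x[c].0$ that mentions $x$:

```latex
(x)\bigl(x[c].0 \mid m[y].0 \mid \overline{m}[x].0\bigr)
  \;\to\; y[c].0 \mid 0 \mid 0
  \;=\; y[c].0
```

The local name $x$ is instantiated by the global name $y$ everywhere inside the scope of $(x)$, including in the component that did not take part in the communication, and nothing outside that scope is affected.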

Let $\to^*$ be the (reflexive and) transitive closure of $\to$. We will denote
by $\tilde{x}$ a sequence $x_1, \ldots, x_n$ of names. We will also abbreviate $(x_1) \ldots (x_n)P$ to
$(\tilde{x})P$. When the length of the sequence $\tilde{x}$ is zero, $(\tilde{x})P$ is just $P$.

3 Algebraic Properties

To study the algebraic semantics of χ-terms, a labeled transition system is
defined as follows, where $\delta$ ranges over $\{\alpha[x], \alpha(x) \mid \alpha \in \mathcal{N} \cup \overline{\mathcal{N}}, x \in \mathcal{N}\}$:

(!/)(ii|a[t/].P)  ^  (iJ|P)[2:/y]  a{y)*P  ^  a{y)*P\P[x/y]  a[x].P°^  P 

P  P'  P-Lp'  ln(6)  n  gn{Q)  =  ^  P P'  xjuji) 

P\Q^P'\Q  {x)P-^{x)P'  ■ 

In the rules, $ln(\delta)$ is $\{x\}$ when $\delta$ is $\alpha(x)$; it is the empty set otherwise. $n(\delta)$ is
the set of names in $\delta$. Let $\Longrightarrow$ denote relation $\to^*$.

A bisimulation equivalence for χ-terms should take into account the distinguished
feature of the localization operators of the language. The equivalence
we introduce in this section is based upon the old idea that two terms are
considered observationally equivalent if and only if placing them in a same context
results in two observationally equivalent terms. Working explicitly with contexts
is unnecessary in our setting due to the presence of the structural equality =.

Definition 2. Suppose $\mathcal{R} \subseteq T \times T$. The relation $\mathcal{R}$ is a local simulation if
whenever $P \mathcal{R} Q$ then for any term $R$ and any sequence $\tilde{x}$ of names it holds that

(i) if $(\tilde{x})(P|R) \to P'$ then $Q'$ exists such that $(\tilde{x})(Q|R) \Longrightarrow Q'$ and $P' \mathcal{R} Q'$;

(ii) if $(\tilde{x})(P|R) \xrightarrow{\delta} P'$ then $Q'$ exists such that $(\tilde{x})(Q|R) \stackrel{\delta}{\Longrightarrow} Q'$ and $P' \mathcal{R} Q'$.

The relation $\mathcal{R}$ is a local bisimulation if both $\mathcal{R}$ and its inverse are local
simulations. The local bisimilarity $\approx$ is the largest local bisimulation.
As usual, local bisimulation up to $\approx$ is a useful tool for proving two χ-terms
to be locally bisimilar. We omit the standard definition.

In the rest of this section, we prove that $\approx$ is a congruence relation. The fact
that $\approx$ is closed under parasition and localization combinators can be proved
already at this point.

Proposition 3. If $P \approx Q$ then (i) $P|O \approx Q|O$ and (ii) $(x)P \approx (x)Q$.

The next lemma is crucial in showing that $\approx$ is a congruence relation. It is the
first indication that local bisimilarity is algebraically appropriate. The property
is not enjoyed by local bisimilarity for π-processes.

Lemma 4. If $P \approx Q$ then $P\sigma \approx Q\sigma$ for an arbitrary substitution $\sigma$.

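Proof. Let $\mathcal{R}$ be the union of $\approx$ and the following

$$\left\{ \bigl((\tilde{z})(P\sigma|R),\ (\tilde{z})(Q\sigma|R)\bigr) \;\middle|\; \begin{array}{l} P \approx Q,\ R \in T,\ \tilde{z} \text{ a sequence of names,} \\ \sigma \text{ a substitution } [y_1/x_1] \ldots [y_n/x_n] \text{ such} \\ \text{that } x_1, \ldots, x_n \text{ are pairwise distinct} \end{array} \right\}.$$

Suppose $(\tilde{z})(P\sigma|R) \,\mathcal{R}\, (\tilde{z})(Q\sigma|R)$ and $(\tilde{z})(P\sigma|R) \to P'$, where $\sigma$ is the substitution
$[y_1/x_1] \ldots [y_n/x_n]$ with $x_1, \ldots, x_n$ being pairwise distinct. Let $a$ and $b$ be fresh
names. Then for the sequence $\tilde{z}$ of names

$$(\tilde{z})\bigl((\tilde{x})(a)(b)(b[b].P \mid a[x_1]. \cdots .a[x_n] \mid \overline{a}[y_1]. \cdots .\overline{a}[y_n].\overline{b}[b]) \mid R\bigr) \Longrightarrow (\tilde{z})(P\sigma|R) \to P'.$$

As $b \notin gn(P, Q)$, $b[b].P \approx b[b].Q$ follows easily. By Proposition 3,

$$
\begin{aligned}
&(\tilde{x})(a)(b)(b[b].P \mid a[x_1]. \cdots .a[x_n] \mid \overline{a}[y_1]. \cdots .\overline{a}[y_n].\overline{b}[b]) \\
&\quad \approx (\tilde{x})(a)(b)(b[b].Q \mid a[x_1]. \cdots .a[x_n] \mid \overline{a}[y_1]. \cdots .\overline{a}[y_n].\overline{b}[b]).
\end{aligned}
$$

So by definition, there exists some $Q'$ such that $P' \approx Q'$ and

$$(\tilde{z})\bigl((\tilde{x})(a)(b)(b[b].Q \mid a[x_1]. \cdots .a[x_n] \mid \overline{a}[y_1]. \cdots .\overline{a}[y_n].\overline{b}[b]) \mid R\bigr) \Longrightarrow Q'.$$

During the above reduction every $a[x_i]$ must have reacted upon $\overline{a}[y_i]$, for $1 \le i \le n$,
and $b[b]$ upon $\overline{b}[b]$. It can be easily proved that all the communications
through $a$ and that through $b$ can happen in the very beginning. That is

$$(\tilde{z})\bigl((\tilde{x})(a)(b)(b[b].Q \mid a[x_1]. \cdots .a[x_n] \mid \overline{a}[y_1]. \cdots .\overline{a}[y_n].\overline{b}[b]) \mid R\bigr) \Longrightarrow (\tilde{z})(Q\sigma|R) \Longrightarrow Q'.$$

So $(\tilde{z})(P\sigma|R) \to P'$ is matched by $(\tilde{z})(Q\sigma|R) \Longrightarrow Q'$. The case when $(\tilde{z})(P\sigma|R) \xrightarrow{\delta} P'$
is similar. So $\mathcal{R}$ is a local bisimulation. It follows that $P \approx Q$ implies $P[y/x] \approx Q[y/x]$.
Therefore $P \approx Q$ implies $P\sigma \approx Q\sigma$ for a substitution $\sigma$. □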

We  now  come  to  the  main  result  of  the  section. 
Theorem 5. $\approx$ is a congruence equivalence: if $P \approx Q$ and $O \in T$ then
(i) $\alpha[x].P \approx \alpha[x].Q$; (ii) $P|O \approx Q|O$;
(iii) $(x)P \approx (x)Q$; (iv) $\alpha(x){*}P \approx \alpha(x){*}Q$.

Proof. We sketch the proof of (iv). The proof of (i) is simpler. Let $\mathcal{R}$ be

$$\{((\tilde{x})(m(y){*}P \mid R),\ (\tilde{x})(m(y){*}Q \mid R)) \mid R \in T,\ m, \tilde{x} \text{ names}\}.$$

Suppose $(\tilde{x})(m(y){*}P \mid R) \to P'$ and that $(\tilde{x})(m(y){*}P \mid R) \to P'$ is caused by
a communication between $m(y){*}P$ and $R$. Then $P'$ is $(\tilde{x})(m(y){*}P \mid P[a/y] \mid R')$.
Similarly $(\tilde{x})(m(y){*}Q \mid R) \to (\tilde{x})(m(y){*}Q \mid Q[a/y] \mid R')$. By Lemma 4, $P[a/y] \approx Q[a/y]$.
By Proposition 3, $(\tilde{x})(m(y){*}Q \mid P[a/y] \mid R') \approx (\tilde{x})(m(y){*}Q \mid Q[a/y] \mid R')$. It
is then easy to see that $\mathcal{R}$ is a local bisimulation up to $\approx$. □


4 π-Processes as χ-Terms

A question naturally arises as to the relationship between π-calculus and
χ-calculus. We give a first answer in this section. Let $\mathcal{P}$ be the set of π-processes
defined as follows: $P := 0 \mid m(x).P \mid \overline{m}x.P \mid P|P' \mid (x)P \mid m(x){*}P$. We refer
the reader to [6] for background material on π-calculus.

There are many bisimulation equivalences on π-processes. What is most
relevant in this section is the open bisimilarity defined in [8]. Actually we will use
a version of open bisimilarity stronger than Sangiorgi’s.

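Definition 6. Let $\mathcal{R}$ be a binary relation on the set of π-processes. The relation
$\mathcal{R}$ is an open bisimulation if whenever $P \mathcal{R} Q$ then for any π-process $R$, any
sequence $\tilde{x}$ of names and any substitution $\sigma$ it holds that

(i) if $(\tilde{x})(P\sigma|R) \xrightarrow{\delta} P'$ then $Q'$ exists such that $(\tilde{x})(Q\sigma|R) \stackrel{\delta}{\Longrightarrow} Q'$ and $P' \mathcal{R} Q'$;

(ii) if $(\tilde{x})(Q\sigma|R) \xrightarrow{\delta} Q'$ then $P'$ exists such that $(\tilde{x})(P\sigma|R) \stackrel{\delta}{\Longrightarrow} P'$ and $P' \mathcal{R} Q'$.

The open bisimilarity is the largest open bisimulation.

Open bisimilarity is a congruence equivalence and is closed under substitution.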

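A structural translation $(\cdot)^\circ$ from π to χ has as nontrivial clauses the following:

$$(m(x).P)^\circ \stackrel{\mathrm{def}}{=} (x)m[x].P^\circ, \qquad (\overline{m}x.P)^\circ \stackrel{\mathrm{def}}{=} \overline{m}[x].P^\circ.$$

Imposing on $\mathcal{P}$ the same structural congruence as given in Definition 1, one has

Theorem 7. For $P, Q \in \mathcal{P}$, it holds that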

(i)  P-^Q  tffP'’  -*  <3°;  (ii)  P’^QiffP” 

(m)  p"^  Q  iff  P^"^^  Q” ;  (iv)  P  Q  iff  (3“ . 

Theorem  8.  For  P,Q  €V,  P  Q  iff  P^  ^ 
5 Call-by-Name in χ-Calculus

A concurrent computation model has to answer the question of whether it
captures sequential computation successfully. The issue is often addressed by
relating variants of λ-calculus to the model. Our focus in this section is on the
call-by-name λ-calculus ([7]), whose semantics is defined by the following rules:

$$(\lambda x.M)N \to M[N/x] \qquad \frac{M \to M'}{MN \to M'N} \qquad \frac{M \to M'}{\lambda x.M \to \lambda x.M'}$$
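
A one-step reducer for the rules just given can be sketched as follows, reading them as: a β-step at the head, reduction of the function part of an application, and reduction under an abstraction, with no reduction of arguments. The tuple representation of terms, the function names, and the scheme for generating fresh variable names (a numeric suffix, assumed not to clash with names already in use) are assumptions made for illustration.

```python
import itertools

_fresh = itertools.count()

def free_vars(t):
    """Free variables of a term ("var", x) | ("lam", x, body) | ("app", f, a)."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, x, n):
    """Capture-avoiding substitution t[n/x]."""
    if t[0] == "var":
        return n if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, n), subst(t[2], x, n))
    y, body = t[1], t[2]
    if y == x:                       # x is rebound here: nothing to replace
        return t
    if y in free_vars(n):            # rename the binder to avoid capture
        z = f"{y}_{next(_fresh)}"
        body = subst(body, y, ("var", z))
        y = z
    return ("lam", y, subst(body, x, n))

def step(t):
    """Perform one reduction step, or return None if no rule applies."""
    if t[0] == "app":
        f, a = t[1], t[2]
        if f[0] == "lam":            # (lambda x. M) N -> M[N/x]
            return subst(f[2], f[1], a)
        f2 = step(f)                 # M -> M' gives M N -> M' N
        return None if f2 is None else ("app", f2, a)
    if t[0] == "lam":                # M -> M' gives lambda x. M -> lambda x. M'
        b = step(t[2])
        return None if b is None else ("lam", t[1], b)
    return None                      # variables do not reduce
```

Note that `step` never looks inside the argument of an application: a term such as f ((λy.y)(λy.y)) with f a variable is irreducible under these rules, which is the restriction Section 6 returns to.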

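The following translation, which is Milner’s encoding of the lazy λ-calculus with
modification, serves as an encoding of the call-by-name λ-calculus in χ-calculus:

$$
\begin{aligned}
[\![x]\!]u &\stackrel{\mathrm{def}}{=} \overline{x}[u] \\
[\![\lambda x.M]\!]u &\stackrel{\mathrm{def}}{=} (v)(x)(\overline{u}[x].\overline{u}[v] \mid [\![M]\!]v) \\
[\![MN]\!]u &\stackrel{\mathrm{def}}{=} (v)(x)([\![M]\!]v \mid v[x].v[u].x(w){*}[\![N]\!]w).
\end{aligned}
$$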

The parasition of $\overline{u}[x].\overline{u}[v]$ and $[\![M]\!]v$ in $[\![\lambda x.M]\!]u$ allows $[\![M]\!]v$ to evolve
independently, thus modeling reduction under λ-abstraction. The encoding preserves
the operational semantics of the call-by-name λ-calculus in the same sense in which
the operational semantics of the lazy λ-calculus is preserved by Milner’s encoding ([5]).
A formal treatment is omitted in this extended abstract.

The call-by-name λ-calculus is one example which cannot be treated
successfully in π-calculus.


6 Towards an Integration of χ and λ

There are two problems one encounters when trying to simulate the operational
semantics of the full λ-calculus. The first is how to model reduction under
λ-abstraction. The second is how to model reduction $MN \to MN'$ caused by
$N \to N'$. The former is to do with parallel computation. There is no reason why
it should pose any problem for concurrent computation. This view is supported
by the result in Sect. 5. The latter is to do with recursion because the λ-term $N$
may be duplicated in future reduction. In any structural interpretation, this $N$
must be translated into the body of a replicator or guarded recursion. So if the
$N$ induces an infinite reduction, the interpretation of $MN$ would have no
terminating reduction sequences. It is our view that the second problem is orthogonal
to concurrent computation. It is caused essentially by the incompatibility of the
two recursion mechanisms.

In this section we take a look at a higher order calculus combining the
communication mechanism of the χ-calculus and the recursion mechanism of the
λ-calculus. The purpose of this investigation is to see if the two mechanisms fit
coherently and if local bisimulation suffices as a tool for studying the algebraic
properties of the language.
6.1 χ with Call-by-Name λ

Let the set $\mathcal{H}$ of higher order χ-terms be defined by the following abstract syntax:

$$E := X \mid \alpha[x].E \mid E|E' \mid (x)E \mid a(X)E \mid \overline{a}[E]$$

where $X$ is a term variable. Let 0 abbreviate $(a)a(X)X$. The semantics of the
higher order χ-calculus is defined by the relevant rules of the first order χ-calculus
together with the following rules incorporating a call-by-name mechanism:

$$a(X)E \mid \overline{a}[F] \to E[F/X] \qquad \frac{E \to F}{a(X)E \to a(X)F}$$

A structural equality is imposed on the members of $\mathcal{H}$, whose definition is the
same as Definition 1. Usually a bisimulation equivalence for a higher order
process calculus is defined for closed processes. This is a tractable approach. But in
the presence of the second reduction rule given above, the method breaks down.
A bisimulation equivalence for higher order χ-calculus has to be defined on all
terms. For that purpose, let’s say that a binary relation $\mathcal{R}$ on $\mathcal{H}$ is substitution
closed if whenever $E \mathcal{R} F$ then $E[E_1/X_1, \ldots, E_i/X_i] \,\mathcal{R}\, F[E_1/X_1, \ldots, E_i/X_i]$ for
$E_1, \ldots, E_i \in \mathcal{H}$ and $X_1, \ldots, X_i$ that are among the free variables of $E|F$.

Definition 9. A substitution closed binary relation $\mathcal{R}$ on $\mathcal{H}$ is a local
bisimulation if whenever $E \mathcal{R} F$ then for any $H$ and $\{\tilde{x}\} \subseteq \mathcal{N}$ it holds that

(i) if $(\tilde{x})(E|H) \xrightarrow{\delta} E'$ then $F'$ exists such that $(\tilde{x})(F|H) \stackrel{\delta}{\Longrightarrow} F'$ and $E' \mathcal{R} F'$;

(ii) if $(\tilde{x})(F|H) \xrightarrow{\delta} F'$ then $E'$ exists such that $(\tilde{x})(E|H) \stackrel{\delta}{\Longrightarrow} E'$ and $E' \mathcal{R} F'$.

The local bisimilarity is the largest local bisimulation on higher order terms.

The above definition is given in terms of a labeled transition system on $\mathcal{H}$ that
is defined by the relevant rules in Sect. 3. It should be remarked that local
bisimilarity is by definition substitution closed.

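Theorem 10. Local bisimilarity $\approx$ is a congruence equivalence: if $E \approx F$ and $G \in \mathcal{H}$ then
(i) $\alpha[x].E \approx \alpha[x].F$; (ii) $E|G \approx F|G$; (iii) $(x)E \approx (x)F$;
(iv) $a(X)E \approx a(X)F$; (v) $\overline{a}[E] \approx \overline{a}[F]$.

Proof. We only prove (v). For the sake of this proof, let’s define $\mathcal{H}_0[X]$ to be the
set of all higher order terms $E$ such that each occurrence of $X$ is within $\overline{a}[G]$ for
some $a$ and some $G \in \mathcal{H}$. Let $\mathcal{R}$ be

$$\{(E[A/X], E[B/X]) \mid A \approx B,\ E \in \mathcal{H}_0[X],\ X \text{ a variable}\}.$$

Suppose $E[A/X] \to G$. Then $G = F[A/X]$ for some $F \in \mathcal{H}_0[X]$. It can be easily
shown that some $H \in \mathcal{H}$ exists such that $E[B/X] \to H$ and $F[B/X] \approx H$.
It follows that $\mathcal{R}$ is a local bisimulation up to $\approx$. Thus $\overline{a}[E] \approx \overline{a}[F]$ since
$\overline{a}[X] \in \mathcal{H}_0[X]$. □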

In the remainder of the section, we justify our claim that the higher order
calculus is a combination of χ and λ.
6.2 Recursion

As a test for local bisimilarity, we examine Thomsen’s recursion ([9]) in this
section. Suppose that E contains free variable X and a does not occur in E. The
following abbreviations will be used:

W]c{E)  Is'  a[a]|a(X)(a[a].£'|a[X]), 
{a)iW^iE)\a[Wi{E)]). 

We remark that rec$X.E$ defined here is slightly different from Thomsen’s. The
idea is to make $W^a_X(E)$ inert. Before proving the main property concerning
rec$X.E$, we first establish the following result.

Lemmall.  {a){F[W%{E)IX]\a[WI,{E)])  (a)(6Ki"'|5[W'l(£^)]Wl^x(^)])- 

where  E  and  F  have  free  variable  X  and  F'  is  obtained  from  F[Wf^{E)/X]  by 
replacing  some  occurrences  ofWf^(E)  by  W^{E).  Here  a  and  b  are  fresh. 

Theorem  12.  Suppose  E  contains  free  X.  Then  recX.E  E[recX.E/X]. 

Proof  Suppose  E  and  F  contain  free  variable  X,  a  ^  n{E,  F)  and  gn(E)  fl 
ln(F)  =  0.  Using  Lemmall,  one  proves  that  ia)(F[Wx{E)/X]\a[Wx{E)])  ^ 
F[recX.E/X].  So  recX.i;  (a)(E[W^{E)/XMW^{E)])  E[TecX.E/X], 
which  is  what  we  are  after.  □ 


6.3  Projecting  Out  Guarded  Recursion 

In this section we show that the higher order χ can be seen as an extension
of the first order χ. A fallout of the result is a justification of the claim that
the guarded recursion is completely unnecessary in the higher order χ-calculus.
Let $\chi^+$ be the higher order χ-calculus enriched with the guarded recursion. The
language $\chi^+$ can be investigated along the same line as the higher order χ has
been.  and  are  defined  accordingly.  It  can  also  be  shown  that  is  a 
congruence  relation.  The  definition  of  a  structural  translation  from  x'^-terms 
to  x^^ -terms  is  nontrivial  only  on  guarded  recursion: 

a(;j;£ll'(a)((x)a[x].(g|a(X)(X|a[X]))|a[(x)a[x].(£|a(X)(Xla[X]))]). 

The  translation projects  the  guarded  recursion  out,  as  it  were. 

Theorem  13.  For  P  G  P  P- 

Theorem  14.  (i)  Suppose  P  and  Q  are  in  'H.  Then  P  Q  iff  P  Q- 
(ii)  Suppose  P  and  Q  are  in  Then  P  Q  iffP^^  Q. 

(Hi)  (a)  if  P  ^  P'  (P  ^  P')  then  P  4  P"  (P  P" )  such  that  P"  ?; 

(1)  if  pL  p"  (p  _  p")  then  P  —  P'(P  —  P')  such  that  P"  si"  P''. 
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Proof,  (i)  Suppose  P,  Q  are  in  Ti.  P  »+  Q^clearly  implies  P  Q.  Suppose 
P  ss"  Q.  Then  (x)(P|;j)  =  {x){P\R)  and  (x)((3|fl)  =  (x)((3|^),  wh«e  R  G  'H+. 
By  theorem  13,  (x)(P|ii)  «+  ix)(P\R)  and  (x)(g|ii)  »+  (x)(g|A).  It  is  now 
easy  to  see  that  ss*"  is  a  local  bisimula^ion  up  „ 

(ii)  By  theorem  13,  P  »+  Q  iff  P  «+  Q.  By  (i)  P  «+  Q  iff  P  a"  □ 

As $\chi^{+}$ extends the first order $\chi$, so does the higher order $\chi$-calculus in view
of Theorem 13 and Theorem 14.

6.4  Full  Integration 

An integration of $\chi$ with the full $\lambda$ is the higher order calculus extended with

F 

a[E]-^a[Fy 

The operational semantics of the full $\lambda$-calculus can be simulated in the fully
integrated calculus. The encoding is the following:

'^=  (i)(z;)(a[u].u[x]|x(X)IM|„) 

IMNJu  — ^  (x)(u)(lAf]t,|u[u].u[j;]|^(t£))(i[u;]|[iV]„)]). 

Theorem  15.  Suppose  M  is  a  X-ierm.  If  M  —>■  N  then  |M|u  — |Ar|u, 

Definition 9 now gives rise to an equivalence relation on the set of all terms
of the fully integrated calculus. The results in Sect. 6.2 and Sect. 6.3 also hold
for this language. Properties (i) through (iv) of Theorem 10 also hold, but so far
we have not been able to prove property (v) of Theorem 10 for the fully
integrated calculus.

7  Remark  on  Pragmatics 

In the formulation of the $\chi$-calculus, we use the same set of names for both global
and local names. But conceptually the identification is not always helpful. The
standard bisimilarity ([6]) for the $\pi$-processes is not closed under the input
prefixing operation. This is because the variable names and the free names are
regarded as semantically different in this approach. Sangiorgi's open bisimilarity
is congruent. But in that approach local names are treated differently. In the
$\chi$-calculus, both local and global names are variable names, which is what local
bisimilarity assumes. The situation is similar to that in the $\lambda$-calculus, where
both free and closed variables are, well, variables that can be instantiated by
any $\lambda$-terms.

But variable names alone do not suffice in practice. This is clear from the
mobile process interpretation of object oriented languages ([10]). The usual
practice is to postulate that $\mathcal{N}$ consists of two parts: a set $\mathcal{N}_v$ of variable
names and a set $\mathcal{N}_c$ of constant names. We can now define a $\chi$-process to be a
$\chi$-term in which all variable names are localized. So in $\chi$-processes there are
two kinds of local names: local variable names and local constant names. A
communication either identifies two local variable names or replaces a local
variable name by a local or global constant name. A communication between two
constant names is

prohibited.  Let  /?  range  over  «  G  G  ^^c  u!^}. 

Definition  16.  Let  7^  be  a  binary  relation  on  the  set  of  x-processes.  is  a 
simulation  if  PHQ  implies  that  if  P  P'  then  there  exists  some  Q'  such  that 

Q  i>  Q’  and  P"JIQ' .  The  relation  7^  is  a  bisimulation  if  both  H  and  its  reverse 
are  simulations.  The  bisimilarity  is  the  largest  bisimulation. 

The $\pi$-calculus can be reexamined in this new setting. The input prefix
operation restricts variable names whereas the localization operation always
restricts constant names. $\pi$-processes are now defined to be those processes in
which all variable names are restricted by input prefixes. Let $\gamma$ range over

r  ca  ca  c(a),  ^  ir  1 

^|a,cGAc}. 

Definition  17.  Let  7^  be  a  binary  relation  on  the  set  of  7r-processes.  71  is  a 
simulation  if  P7ZQ  implies  that  if  P  ^  P'  then  there  exists  some  Q'  such  that 
Q  ^  Q'  and  P'7^Q^  The  relation  72.  is  a  bisimulation  if  both  7^  and  its  inverse 
are  simulations.  The  bisimilarity  is  the  largest  bisimulation. 

The  translation  given  in  Sect.  4  works  in  this  practical  setting.  It  establishes  an 
operational  correspondence  in  the  sense  of  Theorem  7.  In  addition  one  has 

Theorem  18.  For  Tr-processes  P  and  Q,  P  Q  if  and  only  if  P°  Q°. 

So practically speaking, $\pi$ is a subcalculus of $\chi$.

Abstract. Very recently, the second author showed that the question
whether an equation over a trace monoid has a solution or not is
decidable [11,12]. In the original proof this question is reduced to the
solvability of word equations with constraints, by induction on the size of
the commutation relation. In the present paper we give another proof of
this result using lexicographical normal forms. Our method is a direct
reduction of a trace equation system to a word equation system with
regular constraints, using a new result on lexicographical normal forms.


1  Introduction 

Solving equations is a central topic in various fields of computer science,
especially concerning unification, as required by automated theorem proving or
logic programming. A celebrated result of Makanin [10] states that the question
whether an equation over words has a solution or not is decidable: There
exists an algorithm deciding for a given equation $L = R$, where $L, R \in (\Omega \cup \Sigma)^*$
contain both unknowns from $\Omega$ and constants from $\Sigma$, whether an assignment
$\sigma\colon \Omega \to \Sigma^*$ exists, satisfying $\sigma(L) = \sigma(R)$. Slightly more generally, the
existential theory of equations over free monoids is decidable, i.e., given an
existentially quantified, closed first-order formula $S$ over atomic predicates of
the form $L = R$ and $L \ne R$, it is decidable whether $S$ is valid over a given free
monoid. Moreover, adding regular constraints, i.e., atomic predicates of the form
$x \in C$, where $C$ is a regular language, preserves decidability [14].

In  this  paper  we  prove  the  generalization  of  Makanin’s  result  to  trace  monoids, 
which were originally studied in combinatorics [4]. They became meaningful
for  computer  science  in  concurrency  theory,  where  they  were  introduced  by 
Mazurkiewicz  [13]  in  connection  with  the  semantics  of  labelled  Petri  nets.  For 
an  overview  of  trace  theory  and  related  topics  see  “The  Book  of  Traces”  [7]. 
Most  results  obtained  so  far  in  the  area  of  equations  on  traces  were  restricted 
to  equations  without  constants,  see  [8,5].  The  decidability  of  the  solvability  of 
equations  with  constants  was  stated  as  an  important  open  question. 


*  This  work  was  done  during  a  stay  at  the  University  of  Stuttgart. 




2  Notations,  Preliminaries  and  Lexicographical  Normal 
Forms 

An independence alphabet is a pair $(\Sigma, I)$, where $\Sigma$ is a finite alphabet and
$I \subseteq \Sigma \times \Sigma$ is an irreflexive and symmetric relation, called the independence
relation. With a given independence alphabet $(\Sigma, I)$ we associate the trace
monoid $\mathbb{M}(\Sigma, I)$. This is the quotient monoid $\Sigma^*/\equiv_I$, where $\equiv_I$ denotes the
congruence being the equivalence relation generated by the set
$\{uabv = ubav \mid (a, b) \in I,\ u, v \in \Sigma^*\}$; an element $t \in \mathbb{M}(\Sigma, I)$ is called a
trace, and the length $|t|$ of a trace $t$ is given by the length of any representing
word. By alph($t$) we denote the alphabet of a trace $t$, being the set of letters
occurring in $t$.

By 1 we denote both the empty word and the empty trace. Words $v, w \in \Sigma^*$
are called independent (w.r.t. $I$) if $\mathrm{alph}(v) \times \mathrm{alph}(w) \subseteq I$. In this case we
simply write $(v, w) \in I$ or $v \in I(w)$, where $I(w)$ for $w \in \Sigma^*$ is a shorthand for
$\{a \in \Sigma \mid \{a\} \times \mathrm{alph}(w) \subseteq I\}$.

The initial alphabet of $w \in \Sigma^*$ is the set
$\mathrm{init}(w) = \{a \in \Sigma \mid \exists w', w'' \in \Sigma^* \text{ with } w \equiv_I w' \text{ and } w' = a w''\}$.

A word language $L \subseteq \Sigma^*$ is called $I$-closed if whenever $v \in L$ and $w \equiv_I v$ then
we have $w \in L$.

Throughout the paper we will suppose that $(\Sigma, I)$ denotes an independence
alphabet, where $\Sigma$ has the cardinality $n \ge 1$. We suppose that $\Sigma$ is totally
ordered by $<$ and we identify $\Sigma$ with the set $\{1, \ldots, n\}$. The order on $\Sigma$ is
extended to the lexicographical order on $\Sigma^*$.

A word $v \in \Sigma^*$ is in lexicographical normal form (w.r.t. $I$ and $<$) if $v \le w$ holds
for all $w$ such that $v \equiv_I w$. Let LNF denote the set of lexicographical normal
forms, i.e., $\mathrm{LNF} \subseteq \Sigma^*$ is the set of minimal representatives for $\mathbb{M}(\Sigma, I)$. For
$v \in \Sigma^*$ we denote by lex($v$) the unique word $w \in \mathrm{LNF}$ such that $w \equiv_I v$. We
view lex as a mapping $\mathrm{lex}\colon \Sigma^* \to \mathrm{LNF}$.
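
The mapping lex can be sketched in a few lines. The following is an illustration only, not taken from the paper: it assumes that letters are single characters ordered as Python orders them, and that the independence relation is passed as a symmetric set of pairs. The helper names `symmetric`, `initial_positions` and the example relation are hypothetical. The sketch repeatedly emits the smallest letter of the initial alphabet of the remaining word.

```python
def symmetric(pairs):
    """Close a set of letter pairs under symmetry, as an independence relation requires."""
    return {(a, b) for a, b in pairs} | {(b, a) for a, b in pairs}


def initial_positions(word, indep):
    """Yield (letter, position) for every letter in init(word).

    A letter can be moved to the front exactly when every letter before it
    is independent of it. Since I is irreflexive, only the first occurrence
    of a letter can qualify.
    """
    for pos, a in enumerate(word):
        if all((b, a) in indep for b in word[:pos]):
            yield a, pos


def lex(word, indep):
    """Greedy normal form: repeatedly emit the smallest letter of init(rest)."""
    rest, out = list(word), []
    while rest:
        a, pos = min(initial_positions(rest, indep))
        out.append(a)
        del rest[pos]
    return "".join(out)


# Hypothetical example: over {a, b, c} only a and c commute.
indep = symmetric({("a", "c")})
print(lex("cab", indep))  # acb
```

With an empty independence relation every word is its own normal form, which matches the free monoid case.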

There is a simple characterization of lexicographical normal forms due to
Anisimov and Knuth:

Proposition 1 ([3]). Let $\Sigma$ be totally ordered by $<$. Then a word $v \in \Sigma^*$ is in
lexicographical normal form (w.r.t. $I$, $<$) if and only if for every factor $aub$ of $v$
with $a, b \in \Sigma$, $u \in \Sigma^*$ and $(au, b) \in I$ we have $a < b$.
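
Proposition 1 gives a test that needs no enumeration of equivalent words. A minimal sketch, under the same assumptions as before (single-character letters in Python's order, independence given as a symmetric set of pairs; the function name `is_lnf` is hypothetical): for each position, walk left over the letters independent of the current letter and check that each of them is smaller.

```python
def is_lnf(word, indep):
    """Check the factor condition of Proposition 1."""
    for j, b in enumerate(word):
        i = j - 1
        # Walk left over the letters independent of b. Each such word[i] plays
        # the role of a in a factor a u b with (au, b) in I.
        while i >= 0 and (word[i], b) in indep:
            if not word[i] < b:
                return False
            i -= 1
    return True


# Independence relation generated by ab = ba and bc = cb over {a, b, c}.
indep = {("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")}
print([w for w in ("cab", "cba", "bca") if is_lnf(w, indep)])  # ['bca']
```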

Definition 2. Let $\Sigma$ be totally ordered by $<$. For $\emptyset \ne A \subseteq \Sigma$ let the height
$h(A)$ be $h(A) = \max\{a \mid a \in A\}$. Let also $h(\emptyset) = 0$. (Thus, $h(A) \in \{0, \ldots, n\}$.)
The height $h(v)$ of a word $v \in \Sigma^*$ is defined as $h(v) = h(\mathrm{alph}(v))$.

Remark 3. Let $m \ge 1$ and $s, t, v, s_1, \ldots, s_m, t_1, \ldots, t_m \in \Sigma^*$ be words satisfying
the following conditions:

    $s = s_1 \cdots s_m$,
    $t \equiv_I t_1 \cdots t_m$,
    $v = s_1 t_1 \cdots s_m t_m$,
    $t_j \in I(s_{j+1} \cdots s_m)$ for all $1 \le j < m$.

Then we have $st \equiv_I v$.

The previous remark is clear and its converse will be stated for lexicographical
normal forms in the Main Lemma below. It is the crucial correctness argument
for our reduction from trace equations to word equations. The important point
is that the value of $m$ (given below) can be bounded as a function in the size of
the alphabet, and that the height decreases.
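
Remark 3 can be checked mechanically on small instances. The sketch below is an illustration with hypothetical names and data. It decides $\equiv_I$ through the standard projection criterion (two words are equivalent exactly when their projections onto every pair of dependent letters, including a letter paired with itself, coincide) rather than through normal forms. The example monoid is the one generated by $ab = ba$, $bc = cb$.

```python
from itertools import combinations_with_replacement


def trace_equal(u, v, alphabet, indep):
    """Decide u =_I v by comparing projections onto dependent letter pairs."""
    for a, b in combinations_with_replacement(sorted(alphabet), 2):
        if (a, b) in indep:
            continue  # independent letters may be reordered freely
        if [x for x in u if x in (a, b)] != [x for x in v if x in (a, b)]:
            return False
    return True


def satisfies_remark3(s_parts, t_parts, indep):
    """Check t_j in I(s_{j+1} ... s_m) for all 1 <= j < m."""
    m = len(s_parts)
    if len(t_parts) != m:
        return False
    for j in range(m - 1):
        later_s = "".join(s_parts[j + 1:])
        if any((x, y) not in indep for x in t_parts[j] for y in later_s):
            return False
    return True


alphabet = "abc"
indep = {("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")}
s_parts, t_parts = ["", "cac"], ["bb", "a"]   # hypothetical decomposition, m = 2
s, t = "".join(s_parts), "abb"
v = "".join(si + ti for si, ti in zip(s_parts, t_parts))

print(trace_equal(t, "".join(t_parts), alphabet, indep))  # True: t =_I t_1 t_2
print(satisfies_remark3(s_parts, t_parts, indep))         # True
print(trace_equal(s + t, v, alphabet, indep))             # True: st =_I v
```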

Lemma 4 (Main Lemma). Let $s, t, v \in \mathrm{LNF}$ be words in lexicographical
normal form such that $st \equiv_I v$.

Let $h = h(s)$ denote the height of $s$ and suppose $h > 0$.

Then there exist an integer $m$, $1 \le m \le (n-1)(h-1)/2 + 1$, and words
$s_1, \ldots, s_m$, $t_1, \ldots, t_m \in \mathrm{LNF}$ in lexicographical normal form such that the
following conditions hold:

    $s = s_1 \cdots s_m$,
    $t \equiv_I t_1 \cdots t_m$,
    $v = s_1 t_1 \cdots s_m t_m$,
    $s_i \ne 1$ for all $1 < i \le m$,
    $t_j \ne 1$ for all $1 \le j < m$,
    $t_j \in I(s_{j+1} \cdots s_m)$ for all $1 \le j < m$,
    $h(t_j) < h$ for all $1 \le j < m$.

Remark 5. Before giving the proof of the Main Lemma, let us note that the
trace equality $st \equiv_I v$ above cannot be replaced by word equalities of type
$s = s_1 \cdots s_m$, $t = t_1 \cdots t_m$, $v = s_1 t_1 \cdots s_m t_m$. For example, consider
$\mathbb{M}(\Sigma, I) = \{a, b, c\}^*/\{ab = ba,\ bc = cb\}$ and $s = c$, $t = ab$. Then the
lexicographical normal form of $st$ is $v = bca$.
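
The example of Remark 5 is small enough to verify by brute force. The following sketch (function names are hypothetical) enumerates the whole equivalence class of $st = cab$ by swapping adjacent independent letters and takes its minimum. It then uses the observation that word equalities $t = t_1 \cdots t_m$ and $v = s_1 t_1 \cdots s_m t_m$ would force $t$ to occur in $v$ as a scattered subword, which fails here.

```python
def trace_class(word, indep):
    """All words equivalent to `word`, found by swapping adjacent independent letters."""
    seen, todo = {word}, [word]
    while todo:
        w = todo.pop()
        for i in range(len(w) - 1):
            if (w[i], w[i + 1]) in indep:
                swapped = w[:i] + w[i + 1] + w[i] + w[i + 2:]
                if swapped not in seen:
                    seen.add(swapped)
                    todo.append(swapped)
    return seen


def is_subsequence(t, v):
    """True if the letters of t occur in v in the same order (not necessarily adjacent)."""
    it = iter(v)
    return all(letter in it for letter in t)


indep = {("a", "b"), ("b", "a"), ("b", "c"), ("c", "b")}
cls = trace_class("c" + "ab", indep)
v = min(cls)
print(sorted(cls), v)           # ['bca', 'cab', 'cba'] bca
print(is_subsequence("ab", v))  # False
```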

Proof of the Main Lemma. We have $st \equiv_I v$ with $s, t, v \in \mathrm{LNF}$ and $h = h(s) > 0$.
Consider the decomposition of $v$, $v = s_1 t_1 \cdots s_m t_m$, where $m \ge 1$ is minimal such
that $s \equiv_I s_1 \cdots s_m$, $t \equiv_I t_1 \cdots t_m$ and $t_j \in I(s_{j+1} \cdots s_m)$ for all $j$, $1 \le j < m$.
Clearly, since $m$ is minimal, we have $s_i \ne 1$ and $t_j \ne 1$ for all $1 < i \le m$,
$1 \le j < m$. Moreover, the words $s_i, t_j$ are in lexicographical normal form.

Let us first show that $s = s_1 \cdots s_m$. Assume $aub$ is a factor of $s_1 \cdots s_m$ with
$a, b \in \Sigma$, $u \in \Sigma^*$ and $b \in I(au)$. If $aub$ is a factor of some $s_i$, then $a < b$ follows
by Prop. 1 and we are done. Otherwise let $i < j$ be such that $s_i \in \Sigma^* a u'$,
$s_j \in u'' b \Sigma^*$ and $u = u' s_{i+1} \cdots s_{j-1} u''$. Since $t_k \in I(s_j)$ for $k < j$ we obtain
$b \in I(a u' t_i s_{i+1} t_{i+1} \cdots s_{j-1} t_{j-1} u'')$, hence $a < b$ due to $v$ being in lexicographical
normal form. Thus $s_1 \cdots s_m$ is in lexicographical normal form, again by Prop. 1,
and it follows that $s = s_1 \cdots s_m$.

Suppose that $1 \le j < m$ and let $b$ denote the first letter of $s_{j+1}$. Let $a \in \mathrm{alph}(t_j)$,
i.e. $t_j = u a u'$ for some words $u, u'$. Then $a u' b$ is a factor of $v \in \mathrm{LNF}$ satisfying
$b \in I(au')$, thus we have $a < b$. Therefore $h(t_j) < h(b) \le h$ for every $1 \le j < m$.
Finally, assume by contradiction that $m > (n-1)(h-1)/2 + 1$. Let $b_i, a_j$
denote the first letter of $s_i$, $t_j$ respectively, $1 < i \le m$, $1 \le j < m$. Consider
the chain of alphabets $I(s_2 \cdots s_m) \subseteq I(s_3 \cdots s_m) \subseteq \cdots \subseteq I(s_m)$. Note that
we have $I(s_2 \cdots s_m) \ne \emptyset$ due to $t_1 \ne 1$, and also $I(s_m) \ne \Sigma$ due to $s_m \ne 1$.
Therefore by the pigeon-hole principle there exist some indices $1 \le i < j < m$ with
$j - i > (h-1)/2$ satisfying $I(s_{i+1} \cdots s_m) = I(s_{j+1} \cdots s_m)$. Consider the factor
$t_i s_{i+1} t_{i+1} \cdots t_j s_{j+1}$ of $v$. Note that $(t_k, s_l) \in I$ holds for every $k, l$ such that
$i \le k$, $l - 1 \le j$, since $t_k \in I(s_{k+1} \cdots s_m) = I(s_{i+1} \cdots s_m)$. Therefore, $v \in \mathrm{LNF}$
implies $a_i < b_{i+1} < a_{i+1} < \cdots < a_j < b_{j+1}$ and we obtain
$h(s) \ge h(b_{j+1}) \ge 2(j - i + 1) > h$, a contradiction.
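
For concrete words, a decomposition of the shape used in the Main Lemma can be read off $v$ directly. The sketch below is an illustration, not the construction of the proof. It relies only on the fact that in any word $v \equiv_I st$ the occurrences of a fixed letter stemming from $s$ precede those stemming from $t$, because a letter is never independent of itself. It assumes $s, t, v \in \mathrm{LNF}$ with $st \equiv_I v$, so that the $s$-blocks concatenate to $s$ as shown above. The function name `decompose` is hypothetical.

```python
from collections import Counter


def decompose(s, v):
    """Split v into alternating blocks s_1, t_1, ..., s_m, t_m.

    Assumes v =_I st. For each letter x, the first |s|_x occurrences of x in v
    are attributed to s and the remaining ones to t.
    """
    remaining = Counter(s)       # letters of s not yet matched in v
    s_parts, t_parts = [""], [""]
    for x in v:
        if remaining[x] > 0:
            remaining[x] -= 1
            if t_parts[-1]:      # a t-block just ended: open a new pair
                s_parts.append("")
                t_parts.append("")
            s_parts[-1] += x
        else:
            t_parts[-1] += x
    return s_parts, t_parts


# Remark 5: s = c, t = ab, v = lex(st) = bca over ab = ba, bc = cb.
print(decompose("c", "bca"))  # (['', 'c'], ['b', 'a'])
```

In this example $m = 2$, the block $s_1$ is empty, and $t_1 t_2 = ba$ equals $t = ab$ only as a trace, in line with Remark 5.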

3  Trace  Equation  Systems 

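Definition 6. Let $\Omega$ denote a finite set of unknowns with $\Sigma \cap \Omega = \emptyset$.

i) A word equation over $\Sigma$ and $\Omega$ has the form $L \doteq R$ with $L, R \in (\Sigma \cup \Omega)^*$.

ii) An assignment for an equation over $\Sigma$ and $\Omega$ is a mapping $\sigma\colon \Omega \to \Sigma^*$
being extended in a natural way to a homomorphism $\sigma\colon (\Sigma \cup \Omega)^* \to \Sigma^*$, by
$\sigma|_\Sigma = \mathrm{id}_\Sigma$.

A solution for the equation $L \doteq R$ is an assignment $\sigma$ satisfying the equality
$\sigma(L) = \sigma(R)$ in $\Sigma^*$.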

Makanin  [10]  showed  in  1977  that  the  question  whether  a  word  equation  has 
a  solution  or  not  is  decidable.  Moreover,  the  solvability  of  a  system  of  word 
equations  can  be  reduced  by  well-known  techniques  to  the  solvability  of  a  single 
equation.  The  problem  can  also  be  generalized  by  introducing  regular  constraints 
for the unknowns, i.e. regular sets $C_x \subseteq \Sigma^*$ for $x \in \Omega$. Here, a solution $\sigma$
for an equation is required to satisfy $\sigma(x) \in C_x$ for all $x$. It has been shown
by  Schulz  [14]  that  the  solvability  of  word  equations  with  regular  constraints 
remains  decidable.  We  are  going  to  show  that  this  more  general  result  generalizes 
to  traces. 

Definition 7. Let $(\Sigma, I)$ denote an independence alphabet and $\Omega$ a finite set
of unknowns, $\Sigma \cap \Omega = \emptyset$.

i) A trace equation over $(\Sigma, I)$ and $\Omega$ has the form $L = R$, with
$L, R \in (\Sigma \cup \Omega)^*$.

A solution for the equation $L = R$ is an assignment $\sigma\colon \Omega \to \Sigma^*$ satisfying
$\sigma(L) \equiv_I \sigma(R)$.

ii) A system of trace equations is a formula built with the connectives and (&),
or ($\lor$), not ($\lnot$) over atomic predicates of the form $L = R$ (trace equation)
and $x \in C$ (constraint), where $C \subseteq \Sigma^*$ denotes an $I$-closed regular language.
A solution for a system $S$ over $(\Sigma, I)$, $\Omega$ is an assignment $\sigma\colon \Omega \to \Sigma^*$ such
that $S$ evaluates to true when the atomic predicates $L = R$, $x \in C$ are
replaced by the truth value of $\sigma(L) \equiv_I \sigma(R)$, $\sigma(x) \in C$, respectively.

Remark 8. Later we will deal simultaneously with trace and word equations, so
we distinguish notationally between $L \doteq R$ for a word equation, whereas $L = R$
denotes a trace equation. The difference is that equality under an assignment is
interpreted in the free monoid $\Sigma^*$, resp. in the trace monoid $\mathbb{M}(\Sigma, I)$.

Remark 9. A system of word equations (with regular constraints) is just a special
case of Def. 7 where one takes $I = \emptyset$. Since negations can be eliminated (see also
3.1), we note that the question whether a system of word equations has a solution
or not is decidable.

Remark 10. Adding arbitrary (i.e., not $I$-closed) regular constraints to a system
of trace equations makes the question of solvability undecidable. This is due to
the fact that the solvability of the equation $x = y$ with $x \in C$, $y \in C'$ is equivalent
to the non-emptiness of the intersection
$\{w \in \Sigma^* \mid w \equiv_I v \text{ for some } v \in C\} \cap \{w \in \Sigma^* \mid w \equiv_I u \text{ for some } u \in C'\}$.
For regular languages $C, C'$ this last question is known to be undecidable, see [1].

Remark 11. Similar to the word case, the solvability of a trace equation system
could  be  reduced  to  the  solvability  of  a  single  trace  equation  (with  additional 
constraints).  However,  this  would  be  of  no  use  here. 

The  aim  of  this  section  is  to  reduce  the  solvability  problem  for  trace  equations 
to  word  equations  with  regular  constraints.  We  will  give  a  direct  proof  using 
lexicographical  normal  forms  to  show  the  following 

Theorem 12 ([11,12]). Let $S$ be a trace equation system over $(\Sigma, I)$ and $\Omega$.
Then a set $\Omega' \supseteq \Omega$ of unknowns and a system of word equations $S'$ over $\Sigma$, $\Omega'$
can be effectively constructed, such that $S$ is solvable if and only if $S'$ is solvable.

Corollary  13.  It  is  decidable  whether  a  system  of  trace  equations  has  a  solution. 


3.1  Basic  Reductions 

For a given trace equation system $S$ we first eliminate constants by introducing
new unknowns $x_a$ and constraints $x_a \in \{a\}$, for $a \in \Sigma$. Then we replace $a$ by $x_a$
in each equation $L = R$ of $S$. Hence, without loss of generality atomic predicates
are of the form $L = R$, where $L, R \in \Omega^*$.

Furthermore, we may assume that the given system is written in disjunctive
normal form. Then we replace every negation not($L = R$) by the disjunction of
formulas of the type

    $L = xy$ & $R = xz$ & $\mathrm{init}(y) = A$ & $\mathrm{init}(z) = A'$    (1)

where $x, y, z$ denote new unknowns and the disjunction is taken over all alphabets
$A, A' \subseteq \Sigma$ such that $A \cap A' = \emptyset$ and $A \cup A' \ne \emptyset$. Clearly, constraints of the form
$\mathrm{init}(x) = A$ or $\mathrm{alph}(x) = A$, $A \subseteq \Sigma$, can be expressed by $I$-closed regular
languages.

Since the set of $I$-closed regular languages forms an effective boolean algebra (as
the family of recognizable subsets of a monoid [9]) we may also suppose that the
formula contains no negated constraints, i.e. no formula of type not($x \in C$).
Moreover, it suffices to consider trace equations of the form $x_1 \cdots x_k = y_1 \cdots y_l$
with $k \ge l > 0$, $x_i, y_j \in \Omega$. (The equation $x_1 \cdots x_k = 1$ and the occurrences of
each $x_i$ can be deleted from all equations, adding the constraints $\mathrm{alph}(x_i) = \emptyset$.)

3.2 From Traces to Words

The main idea for reducing trace equations to word equations will consist in
replacing a trace equation $L = R$ by some word equations $L_1 \doteq R_1, \ldots, L_k \doteq R_k$
with additional constraints and unknowns. Moreover, for every solution $\sigma$ for
$L = R$ the mapping $\mathrm{lex} \circ \sigma\colon \Omega \to \Sigma^* \to \mathrm{LNF}$ can be extended to a solution
for the equations $L_1 \doteq R_1, \ldots, L_k \doteq R_k$. Vice versa, each solution for the new
equations will also be a solution for $L = R$ when restricted to its unknowns.
This reduction actually goes by a chain of intermediate trace equations. By
choosing an appropriate ordering we will show that the reduction process
terminates yielding a system of word equations (with constraints).

We will consider in the following formulas $S(T, W, C)$ in disjunctive normal form
with atomic predicates from some finite sets $T, W, C$, containing no negations.
$T$ will denote a set of trace equations, $W$ a set of word equations and
$C = \{x \in C_x \mid x \in \Omega\}$ a set of constraints, where each $C_x$ is an $I$-closed regular
language. Moreover, every $L = R$ in $T$ has the form $x_1 \cdots x_k = y_1 \cdots y_l$ with
$k \ge l \ge 1$, $x_i, y_j \in \Omega$. A solution for $S(T, W, C)$ is an assignment $\sigma\colon \Omega \to \Sigma^*$ which
makes the formula evaluate to true when $(L = R)$ from $T$, $(L \doteq R)$ from $W$ and
$x \in C_x$ from $C$ are replaced by the truth value of $\sigma(L) \equiv_I \sigma(R)$, $\sigma(L) = \sigma(R)$,
and $\sigma(x) \in C_x$, respectively.

Definition 14. A formula $S(T, W, C)$ as above is called normalized if for every
solution $\sigma$ for $S$ the mapping $\mathrm{lex} \circ \sigma$ is a solution for $S$, too.

Remark 15. Note that a formula $S(T, \emptyset, C)$ with $I$-closed constraints $C$ is always
normalized.

Remark 16. Suppose $S = S(T, W, C)$ is normalized and let $x = y$ belong to $T$,
where $x, y \in \Omega$. Consider the new formula $S' = S'(T', W', C)$ obtained from $S$
by replacing every occurrence of $x = y$ by $x \doteq y$ and letting $T' = T \setminus \{x = y\}$,
$W' = W \cup \{x \doteq y\}$. Then $S$ is solvable if and only if $S'$ is solvable. Note that
a solution for $S'$ is a solution for $S$, too. However, the converse is true only
because $S$ is a normalized system. Without this assumption about $S$ it cannot
be guaranteed that every solution for $S$ also solves $S'$, see the example below.
Moreover, $S'$ is a normalized system, too.

Example 17. Consider the trace equation system
$S = (\{x = y\}, \{x \doteq ab,\ y \doteq ba\}, \emptyset)$ given as the conjunction
$(x = y)$ & $(x \doteq ab)$ & $(y \doteq ba)$, where $(a, b) \in I$.
Then $S$ is not normalized, but of course it has a solution. However, replacing
$x = y$ by the word equation $x \doteq y$ yields a system with no solution.
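
Example 17 can be replayed with a small evaluator for conjunctions of trace and word equations. This is a sketch under simplifying assumptions that are not part of the paper: the alphabet is exactly $\{a, b\}$ with $(a, b) \in I$, so the trace monoid is free commutative and lex reduces to sorting the letters; equation sides are strings in which `x` and `y` are unknowns and every other character is a constant. All function names are hypothetical.

```python
def apply(sigma, side):
    """Extend an assignment to one side of an equation; constants map to themselves."""
    return "".join(sigma.get(sym, sym) for sym in side)


def lex_commutative(word):
    """lex for the special case in which all letters commute: sort the letters."""
    return "".join(sorted(word))


def solves(sigma, trace_eqs, word_eqs):
    """Evaluate a conjunction of trace equations and word equations under sigma."""
    ok_trace = all(
        lex_commutative(apply(sigma, l)) == lex_commutative(apply(sigma, r))
        for l, r in trace_eqs
    )
    ok_word = all(apply(sigma, l) == apply(sigma, r) for l, r in word_eqs)
    return ok_trace and ok_word


T = [("x", "y")]                     # trace equation x = y
W = [("x", "ab"), ("y", "ba")]       # word equations
sigma = {"x": "ab", "y": "ba"}       # the only assignment satisfying W
lex_sigma = {u: lex_commutative(w) for u, w in sigma.items()}

print(solves(sigma, T, W))                    # True:  S has a solution
print(solves(lex_sigma, T, W))                # False: lex o sigma is not one
print(solves(sigma, [], W + [("x", "y")]))    # False: S' is unsolvable
```

Since the word equations in `W` determine the assignment uniquely, the last line settles unsolvability of the modified system.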

Proof of Thm. 12. Recall that an equation system with $I$-closed constraints
$S = S(T, \emptyset, \{x \in C_x\}_{x \in \Omega})$ over $(\Sigma, I)$, $\Omega$ is a normalized system. As previously
noted it suffices to consider a formula $S$ with trace equations of the form

    $x_1 \cdots x_k = y_1 \cdots y_l$,  $k \ge l \ge 1$,  $(k, l) \ne (1, 1)$.    (2)

We suppose without loss of generality that for all unknowns $x \in \Omega$ some $A_x \subseteq \Sigma$
exists such that $h(A_x) > 0$, and $x \in C_x$ implies $\mathrm{alph}(x) \subseteq A_x$, for all $x$. Moreover,
let $S$ be a conjunction of trace equations as in (2), of word equations and of
$I$-closed regular constraints $x \in C_x$.

We define the weight of a trace equation $x_1 \cdots x_k = y_1 \cdots y_l$ in (2) as the
triple of natural numbers $(l, h(A_{x_1}), k)$ and we consider the lexicographical
ordering on $\mathbb{N} \times \mathbb{N} \times \mathbb{N}$. We will show in the following that every such trace
equation can be replaced by a formula over word equations and trace equations
of lower weight, together with some additional constraints. Concretely, we apply
the following rules.

Rule 1: Suppose $l > 1$ and let $z$ denote a new unknown. Then we replace the
equation $x_1 \cdots x_k = y_1 \cdots y_l$ by

Xi-’Xk  =  z  &  z  k  alph(2:)  C  • 

Rule 2: Suppose $l = 1$ and $k > 2$, and let $z$ denote a new unknown. Then we
replace the equation $x_1 \cdots x_k = y_1$ by

xiz  =  yi  k  X2  •  ■  -  Xk  =  z  k  alph(2^)  C  • 

Rule 3: Suppose $l = 1$ and $k = 2$ and, in order to simplify notation, consider the
equation $xy = z$ (rather than uniformly $x_1 x_2 = y_1$). Moreover, let $h = h(A_x)$
denote the height of $A_x$ (where $\mathrm{alph}(x) \subseteq A_x$ follows from the constraint $x \in C_x$).
We replace $xy = z$ by the disjunction of the word equation

    $xy \doteq z$    (3)


and of formulas of the type

    $x \doteq x_1 \cdots x_m$ & $y = y_1 \cdots y_m$ & $z \doteq x_1 y_1 \cdots x_m y_m$ &
    $\mathrm{alph}(x_1) \subseteq A_1$ & $\cdots$ & $\mathrm{alph}(x_m) \subseteq A_m$ &
    $\mathrm{alph}(y_1) \subseteq B_1$ & $\cdots$ & $\mathrm{alph}(y_m) \subseteq B_m$,    (4)

where $x_i, y_j$ are new unknowns and the disjunction is taken over all values of
$m$ such that $1 \le m \le (n-1)(h-1)/2 + 1$ and over all alphabets $A_1, \ldots, A_m$,
$B_1, \ldots, B_m \subseteq \Sigma$ such that¹

    $A_i \ne \emptyset$ for all $1 < i \le m$, and
    $1 \le h(B_j) < h$ for all $1 \le j < m$, and
    $B_j \times A_i \subseteq I$ for all $1 \le j < i \le m$, and
    $A_1 \cup \cdots \cup A_m \subseteq A_x$, and $B_1 \cup \cdots \cup B_m \subseteq A_y$.    (5)

The word equation $xy \doteq z$ in (3) corresponds to the case $m = 1$ in (4) (this is
in particular the case when $h = 1$ in (5)). It is actually the main case where the
number of trace equations in $S$ decreases.

Let  S'  denote  the  formula  obtained  from  S  by  applying  one  of  the  three  rules 
described  above.  Note  that  none  of  the  rules  adds  negations. 


¹ Obviously some equations become redundant and they can actually be omitted in
the disjunction.

Lemma 18. Let $S$ be a normalized equation system. Then the new system $S'$ is
normalized, too. Moreover, $S'$ is solvable if and only if $S$ is solvable.

Proof. The claim is easily seen for the first two rules above, since there is a
natural bijection between the set of solutions of $S$ and of $S'$, respectively.
Clearly, if $S'$ has been obtained from $S$ by the third rule, then every solution
for $S'$ is a solution for $S$, too, see Rem. 3. Therefore, let us consider an equation
$xy = z$ in $S$ and a solution $\sigma\colon \Omega \to \Sigma^*$ for $S$. Then $\sigma' = \mathrm{lex} \circ \sigma$ is also a solution
for $S$, since $S$ is normalized. We show that $\sigma'$ can be extended to a solution for
$S'$. Let $s = \sigma'(x)$, $t = \sigma'(y)$ and $v = \sigma'(z)$. Hence, $st \equiv_I v$ with $s, t, v \in \mathrm{LNF}$. If
$h(s) = 1$, then in the Main Lemma we have $m = 1$, hence $v = st$. Therefore $\sigma'$
is a solution of the new system $S'$.

Suppose that $st \equiv_I v$ with $s, t, v \in \mathrm{LNF}$, $h(s) = h > 1$. Then some $m$,
$1 \le m \le (n-1)(h-1)/2 + 1$, and words $s_1, \ldots, s_m$, $t_1, \ldots, t_m$ exist, satisfying the
conditions of the Main Lemma. With $\sigma'(x_i) = s_i$, $\sigma'(y_j) = t_j$ it is easily verified
that $\sigma'$ is a solution for $S'$.

The  relation  between  the  solution  set  of  S  and  the  solution  set  of  S' ,  together 
with  the  fact  that  S  is  normalized,  imply  that  S'  is  normalized,  too.  This  shows 
the  lemma. 

Finally, note that the new trace equation $y_1 \cdots y_m = y$ in (4) has lower weight
than $xy = z$ due to $h(B_1) < h = h(A_x)$. Hence the reduction rules establish
a noetherian rewriting system on trace equation systems. Applying the rules as
long as possible we end with a system of word equations $S' = (\emptyset, W', C')$. This
concludes our proof.

4  Computing  Lexicographical  Normal  Forms 

The aim of this section is to give a formula for computing the product of
lexicographical normal forms. This yields an alternative proof of Thm. 12 and the
so  far  best  known  upper  bound  on  the  number  of  new  unknowns  needed  for  the 
reduction.  We  conclude  the  section  with  two  remarks  concerning  the  parallel 
complexity  of  computing  lexicographical  normal  forms. 

Definition 19. Let $\sim_I$ be a relation on $(\Sigma^*)^*$ defined as

    $(x_1, \ldots, x_m) \sim_I (x'_1, \ldots, x'_{m'})$

if $m = m'$ and there exists some $i$, $1 \le i < m$, such that

    $x_j = x'_j$ for all $1 \le j \le m$, $j \notin \{i, i+1\}$, and
    $(x'_i, x'_{i+1}) = (x_{i+1}, x_i)$ and $(x_i, x_{i+1}) \in I$.

By $\approx_I$ we denote the equivalence relation generated on $(\Sigma^*)^*$ by $\sim_I$.

Let $x \in \Sigma^*$; by abuse of language we write $(x_1, \ldots, x_m) \approx_I x$ if some words
$x'_1, \ldots, x'_m$ exist such that

    $(x_1, \ldots, x_m) \approx_I (x'_1, \ldots, x'_m)$ and $x = x'_1 \cdots x'_m$.

Theorem 20. Let $s, t, v \in \mathrm{LNF}$ be words in lexicographical normal form such
that $st \equiv_I v$.

Then there exist positive integers $m, p$ with $m \le (n-1)^2/2 + 1$, $p \le n^n n!$ such that

    $s = s_1 \cdots s_m$,
    $t = t_1 \cdots t_p$,
    $(s_1, \ldots, s_m, t_1, \ldots, t_p) \approx_I v$,

for some words $s_1, \ldots, s_m, t_1, \ldots, t_p \in \Sigma^*$.

Proof. Let $h = h(s)$ denote the height of $s$. Let $m(h), p(h)$ denote the minimal
integers such that

    $s = s_1 \cdots s_{m(h)}$,
    $t = t_1 \cdots t_{p(h)}$,
    $(s_1, \ldots, s_{m(h)}, t_1, \ldots, t_{p(h)}) \approx_I v$,

for  some  words  Si^tj.  Note  that  <  |?;|.  For  h  =  0  we  have  s  =  1,  thus 

m(0)  =  p(0)  =  1,  which  satisfies  the  theorem. 

For $h \ge 1$ we will show by induction on $h$ that $m(h) \le (n-1)(h-1)/2 + 1$ and
$p(h) \le n^h h!$, thereby proving the theorem.

Let $h \ge 1$. By the Main Lemma there exist an integer $m \le (n-1)(h-1)/2 + 1$
and words $s_1, \ldots, s_m$, $t_1, \ldots, t_m$ in lexicographical normal form satisfying

    $s = s_1 \cdots s_m$,
    $t \equiv_I t_1 \cdots t_m$,
    $v = s_1 t_1 \cdots s_m t_m$,
    $s_i \ne 1$, $t_j \ne 1$ for $1 < i \le m$, $1 \le j < m$,
    $t_j \in I(s_{j+1} \cdots s_m)$ and $h(t_j) < h$ for $1 \le j < m$.    (6)

If $h = 1$, then $m = 1$ in (6), so we can take $m(h) = p(h) = 1$, since $t, t_1 \in \mathrm{LNF}$,
which satisfies the claim. Hence let $h, m \ge 2$.

Let  ii  ~  tl  and  U  =  \ex{ti-iti)  for  i  =  2, ...  ,m.  Clearly,  im  =  L  h{ti)  <  h  for 
1  <  i  <  m  and 

ii-iU  =i  ti,  for  1  <  i  <  m.  (7) 

Now  we  can  apply  the  induction  hypothesis  to  each  of  the  (m  —  1)  equivalences 
(7)  obtaining 

(8) 

for  somep  <  (m  —  l)[m(/i  —  1)  +p{h  —  l)],  some  words  . . . ,  and  some  integers 
l  =  p+1  such  that 

ti  —  t'i._^  •  •  •  t[._i  for  every  1  <  i  <  m  .  (9) 

The  above  claim  can  be  verified  by  noting  that 

t~i  and  w/  ivi,...,Vk) 
implies that

t  ~/  (tj , . . . ,  tj—i ,  5  •  •  • )  ?  ^j+i  5  •  •  •  5  ^9)  ’ 

for  some  I  j  —  i  +  k  and  v[, . . .  ,vl  G  S* ,  such  that  Vi  ‘  ■  'v'l  —  vi  ' ' '  Vk  ^^id 
each  Vq  is  a  factor  of  some  Vr.  Hence,  we  obtain  from  (8),  (9)  for  suitable  words 


t  - 

V  (Si  5  .  •  .  ,  Sjri  5  5  •  •  *  5  ^m) 

(^1 7  '  •  •  3  3  ^1  3  *  •  •  5  )  • 


Hence  by  the  induction  hypothesis  we  get 

$p(h) \le (m-1)\,[m(h-1) + p(h-1)]$

$\le (n-1)(h-1)/2 \,[(n-1)(h-2)/2 + 1 + n^{h-1}(h-1)!] \le n^h h!$,


which  concludes  the  proof. 

Remark  21.  We  can  also  use  Thm.  20  in  order  to  prove  the  main  result,  Thm.  12. 
Recall  that  the  main  difficulty  consists  in  replacing  a  trace  equation  of  the  form 
xy  =  .2,  where  x,y,z  e  J7.  By  Thm.  20  we  simply  replace  such  an  equation 
$xy = z$ by a disjunction over clauses of the form

$x = x_1 \cdots x_m$ & $y = y_1 \cdots y_p$ &

$z = z_{\pi(1)} \cdots z_{\pi(m+p)}$ & $\mathrm{alph}(z_i) \subseteq A_i$,

for  all  1  <  m  <  +  1,  1  <  P  <  n^n!,  tt  G  S'^+p  and  Ai  C  E.  Here  Xi,yj 

denote new variables and $z_i = x_i$ for $1 \le i \le m$, resp. $z_{m+j} = y_j$ for $1 \le j \le p$.
S'4+p  denotes  the  set  of  permutations  over  {1, . . . ,  m  +  p}  such  that  for  i  <  j 
the  inequality  7r(z)  >  7r{j)  implies  Ai  x  Aj  C  1.  This  reduction  of  a  single  trace 
equation  to  word  equations  roughly  yields  an  increase  in  the  number  of  word 
equations  by  (A^  +  2)!2^(^+^^  where  N  n^nl  +  (n  -  if  12  +  1.  Hereby  we  need 
N  additional  unknowns. 


We conclude this section with two remarks concerning the parallel complexity
of computing lexicographical normal forms. We consider uniform circuit complexity
classes like $\mathrm{AC}^0$ and $\mathrm{TC}^0$. Let $f\colon \Sigma^* \to \Sigma^*$ be a function such that
$|f(w)| = p(|w|)$ for some polynomial $p$ and every $w \in \Sigma^*$. Let $k \ge 0$. Then
$f$ is $\mathrm{AC}^k$-computable if there is a family $(C_n)_{n \ge 0}$ of polynomial-size circuits of
depth $O(\log^k(n))$ with AND and OR gates of unbounded fan-in/out and unary
NOT gates, such that $C_{|w|}$ computes $f(w)$ for all $w \in \Sigma^*$. A function $f$ is
$\mathrm{TC}^k$-computable if there is a family of circuits as above which in addition to AND,
OR and NOT gates contain MAJORITY gates of unbounded fan-in/out. A MAJORITY
gate yields 1 if and only if more than half of its inputs are 1. In order to
be able to deal with arbitrary alphabets $\Sigma$ one usually assumes that the circuits
have special input/output gates testing $x = a$ for each input position $x$ and
letter $a \in \Sigma$ (analogously for the outputs). Uniformity means that given $n \ge 0$
(a fixed coding of) the circuit $C_n$ can be easily computed (e.g. in logarithmic
space). It is not very hard to verify that $\mathrm{AC}^k \subseteq \mathrm{TC}^k \subseteq \mathrm{AC}^{k+1}$, $k \ge 0$. For more
details about circuit complexity see e.g. [15]. We state the results below without
proofs (being sketched in [6]). With Thm. 20 we obtain

Corollary 22. Let $(\Sigma, I)$ denote an independence alphabet.

Then we can compute $\mathrm{lex}(st)$ on input $s, t \in \mathrm{LNF}$ in uniform $\mathrm{AC}^0$.

Remark 23. We could apply Cor. 22 in order to compute the function lex in
$\mathrm{AC}^1$. However, we can do better; the mapping $\mathrm{lex}\colon \Sigma^* \to \mathrm{LNF}$ is computable in
uniform $\mathrm{TC}^0$. This result can be compared with the fact that the equivalence
$s \equiv_I t$ can be verified in uniform $\mathrm{TC}^0$, too (see [2]).
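
The MAJORITY gate used in the definition of the TC classes above can be sketched in a few lines of Python. This is only an illustration of the gate's truth condition ("more than half of its inputs are 1"); the function name `majority` is an assumption of this sketch and says nothing about circuit size, depth or uniformity.

```python
from typing import Sequence


def majority(bits: Sequence[int]) -> int:
    """Return 1 iff strictly more than half of the input bits are 1."""
    # Comparing 2 * (number of ones) with the fan-in avoids fractions.
    return 1 if 2 * sum(bits) > len(bits) else 0
```

For instance, `majority([1, 1, 0])` yields 1, while a tie such as `majority([1, 0])` yields 0, since a tie is not "more than half".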
Abstract. We exhibit a first-order definable picture language which we
prove is not expressible by any star-free picture expression, i.e., it is not
star-free. Thus first-order logic over pictures is strictly more powerful
than star-free picture expressions are. This is in sharp contrast with
the situation with words: the well-known McNaughton-Papert theorem
states that a word language is expressible by a first-order formula if and
only if it is expressible by a star-free (word) expression.

The main ingredients of the non-expressibility result are a Fraïssé-style
algebraic characterization of star freeness for picture languages and
combinatorics on words.


1  Introduction 

There  are  two  fundamental  results  connecting  logical  definability  with  concepts 
in the theory of regular languages: 1) Büchi's theorem (see [1]) which states that
a  word  language  is  recognized  by  a  finite  automaton  if  and  only  if  it  is  definable 
in  (existential)  monadic  second-order  logic,  and  2)  McNaughton  and  Papert’s 
theorem  (see  [9])  which  says  that  a  word  language  is  star-free  if  and  only  if  it  is 
definable  in  first-order  logic.  In  [6],  it  was  shown  that  the  first  result  essentially 
carries  over  to  picture  (or  “two-dimensional”)  languages  in  the  following  sense: 
a  picture  language  is  recognized  by  a  tiling  system  if  and  only  if  it  is  definable 
in  existential  monadic  second-order  logic  (while  in  [5],  see  also  [6],  full  monadic 
second-order  logic  had  been  proven  to  be  strictly  more  powerful). 

In this paper, we show that the second result does not carry over to picture
languages. More precisely, we exhibit a simple, first-order definable picture
language, denoted $L_+$ (see Section 2), and show that $L_+$ is not expressed by any
star-free picture expression. On the other hand, it is straightforward to see that
every star-free picture language is definable in first-order logic. We thus
conclude that the class of star-free picture languages is strictly contained in the
class of first-order definable picture languages. This clarifies an interesting
question about the fine structure of the class of all recognizable picture languages,
which was brought up in [6]. It should also be noted that by a result from [5], the
class of first-order definable picture languages is strictly contained in the class
of all recognizable picture languages.

As with star-free word expressions, star-free picture expressions are built from
singleton sets using boolean combinations and concatenation. Of course, due to
the two-dimensional structure of pictures there are two kinds of concatenation:
“horizontal” and “vertical”, sometimes also called “row” and “column”
concatenation. Similarly, in first-order formulas over pictures one can use a “horizontal”
and a “vertical” order relation to specify spatial relations between positions. It
is the unrestricted use of these two order relations that makes first-order logic
over pictures more powerful than star-free expressions.

The proof that $L_+$ is not star-free is based on a characterization of star-free
picture languages in the style of Fraïssé’s algebraic characterization of first-order
definability ([3], see also [2]). The other ingredient of the proof is an encoding
of certain pictures by words, which allows us to apply “one-dimensional”
combinatorial arguments. Although Fraïssé’s idea is quite old, this is the first time
that it has been applied to a problem on picture languages.

The paper is organized as follows. In Section 2, basic terminology and
notation are introduced and the main result is stated. Section 3 then describes the
algebraic characterization of star freeness, Section 4 focuses on the encoding of
“diagonal” pictures in words, in Section 5 a combinatorial lemma about words
is established, and in Section 6, the proof of the main theorem is completed.

For  a  survey  on  picture  languages,  see  the  forthcoming  handbook  chapter  [4]. 

Thanks  to  Kousha  Etessami,  Oliver  Matz,  and  Sebastian  Seibert  for  fruitful 
discussions  and  comments  on  drafts  of  this  paper. 


2  Basic  Terminology  and  Main  Result 


A picture¹ over an alphabet A is a matrix with entries from A. We say
$(m \times n)$-picture for a picture with m rows and n columns. An atomic picture is a
$(1 \times 1)$-picture. Words can and should be thought of as $(1 \times n)$-pictures.

There are two concatenations defined for pictures: juxtaposition and
supraposition.² The juxtaposition of an $(m \times n)$-picture $P$ with an $(m' \times n')$-picture $Q$ is
defined when $m = m'$ and is the $(m \times (n + n'))$-picture denoted $P \obar Q$ where

$$
(P \obar Q)_{ij} =
\begin{cases}
P_{ij} & \text{if } j \le n, \\
Q_{i,j-n} & \text{if } j > n.
\end{cases}
\qquad (1)
$$

The supraposition of $P$ and $Q$ is defined when $n = n'$ and is the $((m + m') \times n)$-picture
denoted $P \ominus Q$ where

$$
(P \ominus Q)_{ij} =
\begin{cases}
P_{ij} & \text{if } i \le m, \\
Q_{i-m,j} & \text{if } i > m.
\end{cases}
\qquad (2)
$$

Juxtaposition and supraposition are extended to sets of pictures just as
concatenation of words is extended to sets of words.
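
As a hedged illustration of definitions (1) and (2), the following Python sketch represents a picture as a non-empty list of equally long rows. The names `juxtapose` and `suprapose` and the list-of-lists representation are assumptions of this sketch, not notation from the paper.

```python
from typing import List

Picture = List[List[str]]  # row-major matrix of letters; never empty


def juxtapose(p: Picture, q: Picture) -> Picture:
    """Juxtaposition: defined only when both pictures have the same number of rows."""
    if len(p) != len(q):
        raise ValueError("juxtaposition needs equally many rows")
    # Row i of the result is row i of p followed by row i of q, as in (1).
    return [list(rp) + list(rq) for rp, rq in zip(p, q)]


def suprapose(p: Picture, q: Picture) -> Picture:
    """Supraposition: defined only when both pictures have the same number of columns."""
    if len(p[0]) != len(q[0]):
        raise ValueError("supraposition needs equally many columns")
    # The rows of p come first, followed by the rows of q, as in (2).
    return [list(r) for r in p] + [list(r) for r in q]
```

Both operations are partial, which the sketch models by raising `ValueError` when the dimensions do not fit.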

A star-free picture expression over an alphabet A is built from the letters of
A (each letter a standing for the singleton set with the atomic picture a) using
the additional symbols $\emptyset$ (for the empty set), $+$ (for set-theoretic union), $\sim$ (for
set-theoretic complementation with respect to the set of all pictures over A), and
$\obar$ and $\ominus$. Each star-free picture expression over A defines a picture language over
A in a canonical way. For instance, given an alphabet A with two letters a and
b, the expression $a + (a \ominus {\sim}\emptyset) + (a \obar {\sim}\emptyset) + (a \obar {\sim}\emptyset) \ominus {\sim}\emptyset$ defines the set of all
pictures whose upper left entry is a. Notice that we don’t consider the empty
picture. A picture language is said to be star-free if it can be expressed by a
star-free picture expression.

¹ Not to be confused with the notion of picture defined in [8].

² “Juxtaposition” and “supraposition” are also known as “horizontal” and “vertical”
as well as “row” and “column” concatenation.

The first-order vocabulary we use consists of built-in predicates $<_h$ and $<_v$
for horizontal and vertical order relation and a unary predicate $P_a$ for each
letter a. First-order formulas in this language are interpreted in pictures, where
the first-order variables range over the positions of the picture in question. For
example, consider the formula

$\exists x \exists y_1 \ldots \exists y_4 (y_1 <_h x <_h y_2 \wedge y_3 <_v x <_v y_4 \wedge P_1 x \wedge P_1 y_1 \wedge \ldots \wedge P_1 y_4)$. (3)

This formula defines the set of all pictures satisfying the following condition:
there is a position labeled 1 to the left and right of which there is an occurrence
of 1 and over and under which there is an occurrence of 1. We write $L_+$ for the
picture language containing all pictures over {0, 1} satisfying (3).
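
The condition expressed by formula (3) can be checked directly on a concrete picture. The sketch below assumes that the horizontal order relates positions in the same row and the vertical order relates positions in the same column, which is the reading suggested by the "cross" shape described above; the function name `in_l_plus` is hypothetical.

```python
from typing import List


def in_l_plus(pic: List[List[int]]) -> bool:
    """Check whether a 0/1 picture contains a 1 with a 1 to its left and right
    in the same row and a 1 above and below it in the same column."""
    rows, cols = len(pic), len(pic[0])
    for i in range(rows):
        for j in range(cols):
            if pic[i][j] != 1:
                continue
            left = any(pic[i][c] == 1 for c in range(j))
            right = any(pic[i][c] == 1 for c in range(j + 1, cols))
            above = any(pic[r][j] == 1 for r in range(i))
            below = any(pic[r][j] == 1 for r in range(i + 1, rows))
            if left and right and above and below:
                return True
    return False
```

Under this reading, a 3 x 3 picture with a cross of ones through its centre belongs to the language, whereas a ring of ones around a central 0 does not.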

The  main  result  of  this  paper  is: 

Theorem 1. The language $L_+$ is not star-free.

Every star-free picture expression can be converted into an equivalent first-order
sentence in a straightforward way; in fact, by reusing variables one can
even show that five first-order variables are always sufficient. (The interested
reader may want to notice that in order to define $L_+$ two variables are actually
enough.)

As  a  consequence,  we  have: 

Corollary  2.  The  class  of  star-free  picture  languages  is  strictly  contained  in  the 
class  of  first-order  definable  picture  languages. 

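In the notation of [4], we thus have: $\mathcal{L}(\mathrm{SFRE}) \subsetneq \mathcal{L}(\mathrm{FO}) \subsetneq \mathcal{L}(\mathrm{EMSO}) \subsetneq \mathcal{L}(\mathrm{MSO})$.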

3  Algebraic  Characterization  of  Star  Freeness 

Fix an alphabet A and let $k \ge 0$. There are only a finite number of picture
languages  over  A  that  can  be  defined  by  star-free  picture  expressions  over  A  of 
nesting  depth  at  most  k  in  the  concatenation  operations.  The  set  of  all  these 
picture  languages  is  not  only  finite  but  also  a  boolean  algebra,  i.e.,  there  is  a 
finite  partition  of  all  pictures  over  A  such  that  an  arbitrary  picture  language  over 
A  is  definable  by  a  star-free  expression  of  concatenation  depth  at  most  k  if  and 
only  if  it  is  a  union  of  the  blocks  of  this  partition.  In  this  section,  we  describe 
this  partition  in  terms  of  the  corresponding  equivalence  relation. 

Concatenation  depth,  denoted  cd,  is  defined  by: 


$\mathrm{cd}(\emptyset) = \mathrm{cd}(a) = 0$,
$\mathrm{cd}({\sim}E) = \mathrm{cd}(E)$,
$\mathrm{cd}(E + F) = \max(\mathrm{cd}(E), \mathrm{cd}(F))$,

$\mathrm{cd}(E \obar F) = \mathrm{cd}(E \ominus F) = \max(\mathrm{cd}(E), \mathrm{cd}(F)) + 1$,


where  a  stands  for  an  arbitrary  letter  and  E  and  F  stand  for  arbitrary  picture 
expressions. 
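
The inductive definition of concatenation depth translates directly into a recursive function over an expression tree. The following Python sketch is one possible encoding; the class names (`Empty`, `Letter`, `Compl`, `Plus`, `Juxt`, `Supra`) are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Empty:            # the empty set
    pass


@dataclass(frozen=True)
class Letter:           # a single letter, i.e. one atomic picture
    name: str


@dataclass(frozen=True)
class Compl:            # complementation
    arg: "Expr"


@dataclass(frozen=True)
class Plus:             # union
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Juxt:             # juxtaposition
    left: "Expr"
    right: "Expr"


@dataclass(frozen=True)
class Supra:            # supraposition
    left: "Expr"
    right: "Expr"


Expr = Union[Empty, Letter, Compl, Plus, Juxt, Supra]


def cd(e: Expr) -> int:
    """Concatenation depth: only the two concatenations increase the depth."""
    if isinstance(e, (Empty, Letter)):
        return 0
    if isinstance(e, Compl):
        return cd(e.arg)
    if isinstance(e, Plus):
        return max(cd(e.left), cd(e.right))
    if isinstance(e, (Juxt, Supra)):
        return max(cd(e.left), cd(e.right)) + 1
    raise TypeError(f"not a picture expression: {e!r}")
```

As a usage example, an expression for "all pictures whose upper left entry is a", modeled on the example in Section 2, has a nested concatenation in its last summand and therefore depth 2:

```python
a, everything = Letter("a"), Compl(Empty())
upper_left_a = Plus(
    Plus(a, Supra(a, everything)),
    Plus(Juxt(a, everything), Supra(Juxt(a, everything), everything)),
)
depth = cd(upper_left_a)  # the nested Supra(Juxt(...), ...) gives depth 2
```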

We define k-equivalence, $\equiv_k$ in symbols, as a relation over pictures inductively
as follows.

1. Pictures are 0-equivalent if they are identical or both not atomic.

2. Pictures $P$ and $Q$ are $(k+1)$-equivalent if the following conditions hold:

(K) $P$ and $Q$ are k-equivalent.

(J) For all pictures $P_1$, $P_2$ such that $P = P_1 \obar P_2$ there exist pictures $Q_1$,
$Q_2$ such that $Q = Q_1 \obar Q_2$, $P_1 \equiv_k Q_1$, and $P_2 \equiv_k Q_2$.
(S) For all pictures $P_1$, $P_2$ such that $P = P_1 \ominus P_2$ there exist pictures $Q_1$, $Q_2$
such that $Q = Q_1 \ominus Q_2$, $P_1 \equiv_k Q_1$, and $P_2 \equiv_k Q_2$.
(J') & (S') Conditions (J) and (S) hold when the roles of $P$ and $Q$ are
exchanged.

This means $P$ and $Q$ are $(k+1)$-equivalent if and only if they are k-equivalent
and for any decomposition of $P$ into two pictures, one can find a decomposition
of $Q$ into two pictures such that corresponding “factors” are k-equivalent, and
vice versa.
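
For very small pictures, k-equivalence as defined above can be decided by brute force. The following Python sketch follows conditions (K), (J), (S), (J') and (S') literally; pictures are encoded as tuples of tuples so that results can be memoized. All names are assumptions of this sketch, and the running time grows quickly with k and the picture size.

```python
from functools import lru_cache


def picture(rows):
    """Turn an iterable of rows (e.g. strings) into a hashable picture."""
    return tuple(tuple(r) for r in rows)


def is_atomic(p):
    return len(p) == 1 and len(p[0]) == 1


def juxt_splits(p):
    """All ways to write p as a juxtaposition of two pictures."""
    width = len(p[0])
    return [(tuple(r[:c] for r in p), tuple(r[c:] for r in p)) for c in range(1, width)]


def supra_splits(p):
    """All ways to write p as a supraposition of two pictures."""
    return [(p[:r], p[r:]) for r in range(1, len(p))]


@lru_cache(maxsize=None)
def equivalent(k, p, q):
    if k == 0:
        return p == q or (not is_atomic(p) and not is_atomic(q))
    if not equivalent(k - 1, p, q):          # condition (K)
        return False
    for splits in (juxt_splits, supra_splits):   # (J)/(J') and (S)/(S')
        for x, y in ((p, q), (q, p)):            # both directions
            for x1, x2 in splits(x):
                if not any(
                    equivalent(k - 1, x1, y1) and equivalent(k - 1, x2, y2)
                    for y1, y2 in splits(y)
                ):
                    return False
    return True
```

On words (one-row pictures), for example, `ab` and `ba` are 0-equivalent but not 1-equivalent, while `aaaa` and `aaaaa` are 1-equivalent and `aaa` and `aaaa` are not.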

The  key  fact  about  this  equivalence  relation  is: 

Theorem 3 (correctness and completeness). A picture language $L$ is
definable by a star-free expression of concatenation depth at most k if and only if
$L$ is a union of $\equiv_k$-classes.

We  leave  out  the  proof,  which  follows  proofs  of  similar  claims  in  the  literature, 
see,  e.  g.,  [10],  where  the  fine  structure  of  the  class  of  all  star-free  word  languages 
is  characterized. 

Thus, in order to prove that a picture language $L$ is not star-free we only
have to show that for every k there are two pictures $P$ and $Q$ such that $P \in L$,
$Q \notin L$, but $P \equiv_k Q$. That is what we will do in Section 6 (for $L = L_+$).

We need some facts about k-equivalence on words, all of which can be proven
by a straightforward induction on k.

Lemma 4 (projections). Let A and B be alphabets and $\pi\colon A^* \to B^*$ a
homomorphism. If u and v are strings over A such that $u \equiv_k v$, then $\pi(u) \equiv_k \pi(v)$.

Lemma 5 (congruence property for words). Let $k \ge 0$. The relation $\equiv_k$
restricted to words is a congruence relation, i.e., $uu' \equiv_k vv'$ whenever $u \equiv_k v$
and $u' \equiv_k v'$ for words $u$, $u'$, $v$, and $v'$.

Lemma 6 (aperiodicity). For each $k \ge 0$, there exists $l_k > 0$ such that

$u^{l_k + m} \equiv_k u^{l_k}$ for every word u and every $m \ge 0$. (4)
Corollary 7. Let $k \ge 0$. Assume u is a word of the form


where  n>  k  I  and  ij  >  k  for  every  j  G  {1, .  .  . ,  n  -  1}. 

(a)  IfioAn  >  h:  and  v  is  obtained  from  u  by  changing  one  occurrence  of  1  to  0, 
then  u  y- 

(b)  If  V  is  obtained  by  changing  one  inner  (i.e.,  neither  the  first  nor  the  last) 
occurrence  of  1  to  0,  then  u  =k  v. 

We also need the following very simple (and weak) congruence-like property
of k-equivalence over pictures, which can also be proven by induction on k.

Lemma 8. Let $k \ge 0$, $l > 0$. Assume A is an alphabet, $a \in A$, $P$ is an $(m \times n)$-picture
over A, and $Q$ is an $(m' \times n')$-picture over A. Define $P'$ and $Q'$ to be
the unique $(m \times l)$- and $(m' \times l)$-picture over the alphabet $\{a\}$. If $P \equiv_k Q$, then
$P \obar P' \equiv_k Q \obar Q'$. The dual claim holds for supraposition instead of juxtaposition.


4  Diagonal  Pictures 

As explained above, in order to prove that $L_+$ is not star-free we have to find
pictures $P^k$ and $Q^k$ such that $P^k \in L_+$, $Q^k \notin L_+$, and $P^k \equiv_k Q^k$ for every
k. We will choose the pictures $P^k$ and $Q^k$ from a class of specifically designed
pictures, so-called “diagonal pictures”.

We will introduce diagonal pictures as certain pictures over {0, 1} determined
by words over an alphabet denoted by D. This alphabet is defined to be the set
of subsets of the five-element set $C = \{1, n, s, w, e\}$, where n stands for “north”,
s for “south”, etc. Given an element $a \in C$ and a string u over D, the
a-projection of u, denoted $u \downarrow a$, is the unique string v over {0, 1} of length $|u|$
satisfying $v_i = 1$ if and only if $a \in u_i$. Given a word u of length l over D, the
corresponding diagonal picture, $P(u)$ in symbols, is given by:

$$
P(u) =
\begin{pmatrix}
u_1 \downarrow 1 & u_2 \downarrow n & u_3 \downarrow n & \cdots & u_{l-1} \downarrow n & u_l \downarrow n \\
u_2 \downarrow w & u_2 \downarrow 1 & 0 & \cdots & 0 & u_2 \downarrow e \\
u_3 \downarrow w & 0 & u_3 \downarrow 1 & \cdots & 0 & u_3 \downarrow e \\
\vdots & \vdots & & \ddots & & \vdots \\
u_{l-1} \downarrow w & 0 & 0 & \cdots & u_{l-1} \downarrow 1 & u_{l-1} \downarrow e \\
u_l \downarrow w & u_2 \downarrow s & u_3 \downarrow s & \cdots & u_{l-1} \downarrow s & u_l \downarrow 1
\end{pmatrix}
\qquad (6)
$$
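
The construction of a diagonal picture can be sketched in Python as follows. The entry rules are read off from the case distinction for atomic subpictures in the proof of Lemma 9 (diagonal: 1-projection; first row: n; first column: w; last row: s; last column: e; everything else 0). The function name `diagonal_picture` and the encoding of letters of D as Python sets of the strings `'1'`, `'n'`, `'s'`, `'w'`, `'e'` are assumptions of this sketch.

```python
from typing import List, Sequence, Set


def diagonal_picture(u: Sequence[Set[str]]) -> List[List[int]]:
    """Build the l x l picture over {0, 1} determined by a word u of length l over D."""
    l = len(u)

    def bit(letter: Set[str], a: str) -> int:
        # a-projection of a single letter
        return 1 if a in letter else 0

    pic = [[0] * l for _ in range(l)]
    for i in range(1, l + 1):          # 1-based row index
        for j in range(1, l + 1):      # 1-based column index
            if i == j:
                value = bit(u[i - 1], "1")   # diagonal
            elif i == 1:
                value = bit(u[j - 1], "n")   # first row
            elif j == 1:
                value = bit(u[i - 1], "w")   # first column
            elif i == l and j < i:
                value = bit(u[j - 1], "s")   # last row
            elif j == l and i < j:
                value = bit(u[i - 1], "e")   # last column
            else:
                value = 0
            pic[i - 1][j - 1] = value
    return pic
```

For a three-letter word whose middle letter is the full set and whose outer letters are empty, the result is the 3 x 3 cross of ones through the centre.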

Given an arbitrary $(m \times n)$-picture $P$ and $r_0$, $r_1$, $r_2$, and $r_3$ with $1 \le r_0 \le
r_1 \le m$ and $1 \le r_2 \le r_3 \le n$, we define the subpicture of $P$ determined by $r_0$,
..., $r_3$, denoted $P[r_0, r_1, r_2, r_3]$ or $P[\bar r]$ for short, to be

$$
\begin{pmatrix}
P_{r_0,r_2} & P_{r_0,r_2+1} & \cdots & P_{r_0,r_3} \\
P_{r_0+1,r_2} & P_{r_0+1,r_2+1} & \cdots & P_{r_0+1,r_3} \\
P_{r_0+2,r_2} & P_{r_0+2,r_2+1} & \cdots & P_{r_0+2,r_3} \\
\vdots & \vdots & & \vdots \\
P_{r_1,r_2} & P_{r_1,r_2+1} & \cdots & P_{r_1,r_3}
\end{pmatrix}
\qquad (7)
$$

As with diagonal pictures, we describe subpictures of diagonal pictures by
words. For this, we use two additional symbols: $\mathrm{h}$, the horizontal clipping mark,
and $\mathrm{v}$, the vertical clipping mark. We write E for the set of subsets of $C \cup \{\mathrm{h}, \mathrm{v}\}$.
With $P(u)[\bar r]$ we associate the word $u[\bar r]$ over E defined by:

- $|u[\bar r]| = |u|$,

- $u[\bar r]_i \cap C = u_i$ for $i \in \{1, \ldots, |u|\}$,

- $\mathrm{h} \in u[\bar r]_i$ iff $r_0 = i$ or $r_1 = i$, and
- $\mathrm{v} \in u[\bar r]_i$ iff $r_2 = i$ or $r_3 = i$.
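
Both the subpicture operation and the associated word with clipping marks are easy to state in code. The sketch below uses 1-based, inclusive indices as in the text; the function names `subpicture` and `clipped_word`, and the use of the strings `'h'` and `'v'` for the clipping marks, are assumptions of this illustration.

```python
from typing import FrozenSet, List, Sequence, Set


def subpicture(p: List[list], r0: int, r1: int, r2: int, r3: int) -> List[list]:
    """Rows r0..r1 and columns r2..r3 of p (1-based, inclusive)."""
    return [row[r2 - 1:r3] for row in p[r0 - 1:r1]]


def clipped_word(u: Sequence[Set[str]], r0: int, r1: int, r2: int, r3: int) -> List[FrozenSet[str]]:
    """The word u[r]: each letter of u, plus 'h' at positions r0 and r1
    and 'v' at positions r2 and r3."""
    word = []
    for i, letter in enumerate(u, start=1):
        marked = set(letter)
        if i in (r0, r1):
            marked.add("h")
        if i in (r2, r3):
            marked.add("v")
        word.append(frozenset(marked))
    return word
```

Note that the marked word has the same length as u and that removing the two marks from every letter gives back u, in line with the first two conditions of the definition.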

The  important  lemma  about  diagonal  pictures  is  the  following. 

Lemma 9. Let $k \ge 0$. Assume u and v are words over D and $\bar r$ and $\bar s$ are such
that $u[\bar r] \equiv_{7k+4} v[\bar s]$. Then $P(u)[\bar r] \equiv_k P(v)[\bar s]$.

Proof.  The  proof  goes  by  induction  on  k. 

Induction base, k = 0. By symmetry, it is sufficient to show that under the
assumption $u[\bar r] \equiv_4 v[\bar s]$, if $P(u)[\bar r]$ is atomic, then so is $P(v)[\bar s]$ and both pictures
are identical.

In general, a picture $P(u)[\bar r]$ is atomic if and only if $r_0 = r_1$ and $r_2 = r_3$. This
is true if and only if $u[\bar r]$ contains exactly one position i such that $\mathrm{h} \in u[\bar r]_i$ and
exactly one position j such that $\mathrm{v} \in u[\bar r]_j$. Whether or not this is true is easily
seen to be determined by the 4-equivalence class of $u[\bar r]$. Hence, if $u[\bar r] \equiv_4 v[\bar s]$
and $P(u)[\bar r]$ is atomic, then $P(v)[\bar s]$ is atomic as well.

Furthermore, if a picture $P(u)[\bar r]$ is atomic, then:

$$
P(u)[\bar r] =
\begin{cases}
u[\bar r]_{r_0} \downarrow 1 & \text{if } r_0 = r_2, \\
u[\bar r]_{r_2} \downarrow n & \text{if } 1 = r_0 < r_2, \\
u[\bar r]_{r_0} \downarrow w & \text{if } 1 = r_2 < r_0, \\
u[\bar r]_{r_2} \downarrow s & \text{if } 1 < r_2 < r_0 = |u|, \\
u[\bar r]_{r_0} \downarrow e & \text{if } 1 < r_0 < r_2 = |u|, \\
0 & \text{otherwise.}
\end{cases}
$$

It is now easily seen that the order relation between $r_0$ and $r_2$ as well as which
of these two values is 1 or $|u|$ is determined by the 4-equivalence class of $u[\bar r]$,
i.e., if $u[\bar r] \equiv_4 v[\bar s]$ and $P(u)[\bar r]$ is atomic, then we are in the same of the above
cases for both pictures $P(u)[\bar r]$ and $P(v)[\bar s]$.

Also, if $u[\bar r] \equiv_4 v[\bar s]$, $r_0 = r_1$, $r_2 = r_3$, $s_0 = s_1$, and $s_2 = s_3$, then $u[\bar r]_{r_0} = v[\bar s]_{s_0}$
and $u[\bar r]_{r_2} = v[\bar s]_{s_2}$. Thus, if $u[\bar r] \equiv_4 v[\bar s]$ and $P(u)[\bar r]$ is atomic, then
$P(u)[\bar r]$ and $P(v)[\bar s]$ are identical.
Induction step. Assume that for all $k' \le k$ and all $\bar r'$ and $\bar s'$, if $u[\bar r'] \equiv_{7k'+4} v[\bar s']$,
then $P(u)[\bar r'] \equiv_{k'} P(v)[\bar s']$. Assume also that $u[\bar r] \equiv_{7k+11} v[\bar s]$. Write $P$ and
$Q$ for $P(u)[\bar r]$ and $P(v)[\bar s]$, respectively. We want to show $P \equiv_{k+1} Q$.

First, notice that we have $P \equiv_k Q$ by induction hypothesis, as $u[\bar r] \equiv_{7k+11} v[\bar s]$
implies $u[\bar r] \equiv_{7k+4} v[\bar s]$. So (K) holds. What we need to show in addition is
that (J), (S), (J'), and (S') hold. By symmetry, it is enough to consider only (J).

Let $P_1$ and $P_2$ be such that $P = P_1 \obar P_2$. There exists $r_4$ such that $r_2 \le r_4$,
$r_4 + 1 \le r_3$, and

$P_1 = P(u)[r_0, r_1, r_2, r_4]$,

$P_2 = P(u)[r_0, r_1, r_4 + 1, r_3]$.

In the rest of this proof we will only analyze the situation where $r_2 < r_4$ and
$r_4 + 1 < r_3$; the other three cases (where $r_2 = r_4$ and $r_4 + 1 = r_3$, or $r_2 < r_4$ and
$r_4 + 1 = r_3$, or $r_2 = r_4$ and $r_4 + 1 < r_3$) are simpler and can be dealt with in a
similar way.

There exist unique (possibly empty) words $u_1$, $u_2$, $u_3$, and $u_4$ and letters $a_1$,
$a_2$, $a_3$, and $a_4$ such that

$u[\bar r] = u_1 (a_1 \cup \{\mathrm{v}\}) u_2 a_2 a_3 u_3 (a_4 \cup \{\mathrm{v}\}) u_4$,

$u[r_0, r_1, r_2, r_4] = u_1 (a_1 \cup \{\mathrm{v}\}) u_2 (a_2 \cup \{\mathrm{v}\}) a_3 u_3 a_4 u_4$,

$u[r_0, r_1, r_4 + 1, r_3] = u_1 a_1 u_2 a_2 (a_3 \cup \{\mathrm{v}\}) u_3 (a_4 \cup \{\mathrm{v}\}) u_4$.

Since we assume $u[\bar r] \equiv_{7k+11} v[\bar s]$, we can conclude there are $v_1$, $v_2$, $v_3$, and $v_4$
such that

- $v[\bar s] = v_1 (a_1 \cup \{\mathrm{v}\}) v_2 a_2 a_3 v_3 (a_4 \cup \{\mathrm{v}\}) v_4$, and

- $u_1 \equiv_{7k+4} v_1$, $u_2 \equiv_{7k+4} v_2$, $u_3 \equiv_{7k+4} v_3$, and $u_4 \equiv_{7k+4} v_4$.

Let $s_4 = |v_1 a_1 v_2 a_2|$ and define

$Q_1 = P(v)[s_0, s_1, s_2, s_4]$,

$Q_2 = P(v)[s_0, s_1, s_4 + 1, s_3]$.

Then $Q = Q_1 \obar Q_2$, and to finish the proof we need only show $P_1 \equiv_k Q_1$ and
$P_2 \equiv_k Q_2$.

From the definition of $s_4$, we know:

$v[s_0, s_1, s_2, s_4] = v_1 (a_1 \cup \{\mathrm{v}\}) v_2 (a_2 \cup \{\mathrm{v}\}) a_3 v_3 a_4 v_4$,

$v[s_0, s_1, s_4 + 1, s_3] = v_1 a_1 v_2 a_2 (a_3 \cup \{\mathrm{v}\}) v_3 (a_4 \cup \{\mathrm{v}\}) v_4$.

Since $(7k+4)$-equivalence is a congruence relation on words, we obtain:

$u_1 (a_1 \cup \{\mathrm{v}\}) u_2 (a_2 \cup \{\mathrm{v}\}) a_3 u_3 a_4 u_4 \equiv_{7k+4} v_1 (a_1 \cup \{\mathrm{v}\}) v_2 (a_2 \cup \{\mathrm{v}\}) a_3 v_3 a_4 v_4$,
$u_1 a_1 u_2 a_2 (a_3 \cup \{\mathrm{v}\}) u_3 (a_4 \cup \{\mathrm{v}\}) u_4 \equiv_{7k+4} v_1 a_1 v_2 a_2 (a_3 \cup \{\mathrm{v}\}) v_3 (a_4 \cup \{\mathrm{v}\}) v_4$.
This implies, by the induction hypothesis,

$P(u)[r_0, r_1, r_2, r_4] \equiv_k P(v)[s_0, s_1, s_2, s_4]$,

$P(u)[r_0, r_1, r_4 + 1, r_3] \equiv_k P(v)[s_0, s_1, s_4 + 1, s_3]$,

hence $P_1 \equiv_k Q_1$ and $P_2 \equiv_k Q_2$.

5 A Combinatorial Lemma about Words

As said above, we will define the pictures $P^k$ and $Q^k$ we are looking for as certain
diagonal pictures: we will set $P^k = P(s_k)$ and $Q^k = P(t_k)$ for appropriate words
$s_k$ and $t_k$ with specific combinatorial properties. The building blocks in the
construction of these words are the words described in the following lemma.

Lemma 10. Let A be an arbitrary finite alphabet, $a \in A$, and $k, m \ge 0$. There
exists a word $u_{k,m}$ such that

- all words in $(u_{k,m}A)^* u_{k,m}$ are k-equivalent,

- $u_{k,m} \in a^m A^* a^m$, and

-  Uk,m.  €  A*{bA*)'^'^^  for  every  b  £  A. 

Proof  By  induction  on  k.  Let  A  =  {ai, . . . ,  and  assume  a  —  ai. 

Induction  base.  For  ^  =  0,  the  following  choice  is  obviously  correct: 

uo.m  =  aiai(aTa2a’rasa’[‘---aTa„aTr+'^  ■ 

Induction step. Suppose $u_{k,m}$ is a word such that the above three conditions
hold. Set

$u_{k+1,m} = u_{k,m}u_{k,m}\, a_1\, u_{k,m}u_{k,m}\, a_2\, u_{k,m}u_{k,m} \cdots u_{k,m}u_{k,m}\, a_n\, u_{k,m}u_{k,m}$. (8)

We claim that this choice is correct. The second and third condition are obviously
satisfied.

By  the  induction  hypothesis,  all  words  in  {'^k+i,m^)* '^k+i,m  ^-equivalent. 
Furthermore,  if  u  =  u'u"  is  a  decomposition  of  a  word  from  the  set  denoted 

t’y  ‘hen 

-  there  exist  iv,  «/,  w" ,  and  w”'  such  that  u'  -  wwf  u"  =  w'w"  G 

Uk.mAuk^m,  and  w  =k  Uk,m  =k  w'"  (by  the  induction  hypothesis);  or 

-  li//!  <  \uk,7n\  and  there  exists  w  and  such  that  idw  =  Uk,m, 
and  tP  =k  '^ik,m  (by  the  induction  hypothesis);  or,  symmetrically, 

-  and  there  exists  w  and  if;'  such  that  w'u"  —  Uk,m,  =  w', 
and  lu  =k  Uk^m  (by  the  induction  hypothesis). 

On  the  other  hand,  every  word  from  (u^^^  allows  all  the  decompo¬ 

sitions  described  above.  Therefore,  all  words  in  {'^k+i,m^y'^k+i,7n 
equivalent. 

6  Tying  Things  Together 

As pointed out in Section 3, all we need to do in order to prove that $L_+$ is not
star-free is to define pictures $P^k \in L_+$ and $Q^k \notin L_+$ and show $P^k \equiv_k Q^k$, for
$k \ge 0$.

For notational convenience, write $+$ for $\{1, n, s, w, e\}$ and $\top$ for $\{1, s, w, e\}$.
Let  ivk  be  the  word  usk,2hk  horn  Lemma  10  with  A  =  jC)\{-f}  and  a  —  0.  Set 
$s_k = w_k {+} w_k$ and $t_k = w_k \emptyset w_k$ and define $P^k = P(s_k)$ and $Q^k = P(t_k)$. These
are the pictures we are looking for:
Proposition 11. For $k \ge 0$,

1. $P^k \in L_+$ and $Q^k \notin L_+$, and

2. $P^{k'} \equiv_k Q^{k'}$ for all $k' \ge k$.

Proof. The first claim is obvious (cf. (6)).

The proof of the second claim goes by induction on k. The induction base,
k = 0, is trivial. In the induction step, we assume $P^{k'} \equiv_k Q^{k'}$ for all $k' \ge k$ and
need to show $P^{k'} \equiv_{k+1} Q^{k'}$ for all $k' > k$.

Let $k' > k$. Write $P$ and $Q$ for $P^{k'}$ and $Q^{k'}$. We have to verify (K), (J), (S),
(J'), and (S'). By induction hypothesis, we know $P \equiv_k Q$, hence (K) holds. Of
the other four requirements, we will only consider (S) in the rest of this proof.
That (J), (J'), and (S') hold can be proven in a similar fashion.

Let $P = P_1 \ominus P_2$. Without loss of generality, assume $P_1$ has fewer rows than
$P_2$. (Notice that $P_1$ and $P_2$ cannot have the same number of rows.)

Also,  assume  that  Pi  and  P2  have  at  least  2  rows.  If  this  is  not  the  case,  the 
situation  is  simpler  but  changes  would  have  to  be  made  to  the  notation  in  the 
following. 

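We have to find $Q_1$ and $Q_2$ such that $Q = Q_1 \ominus Q_2$, $P_1 \equiv_k Q_1$, and $P_2 \equiv_k Q_2$.
Let p be the number of rows of $P$ (which is also the number of columns of
$P$ and the number of rows and columns of $Q$) and $p_1$ the number of rows of $P_1$.
Then $P_1 = P(w_{k'}{+}w_{k'})[1, p_1, 1, p]$ and $P_2 = P(w_{k'}{+}w_{k'})[p_1 + 1, p, 1, p]$.
Write $w_{k'}{+}w_{k'}$ as $a_1 s' a_2 a_3 s''{+}s''' a_4$ such that

$(w_{k'}{+}w_{k'})[1, p_1, 1, p] = (a_1 \cup \{\mathrm{h}, \mathrm{v}\}) s' (a_2 \cup \{\mathrm{h}\}) a_3 s''{+}s''' (a_4 \cup \{\mathrm{v}\})$,
$(w_{k'}{+}w_{k'})[p_1 + 1, p, 1, p] = (a_1 \cup \{\mathrm{v}\}) s' a_2 (a_3 \cup \{\mathrm{h}\}) s''{+}s''' (a_4 \cup \{\mathrm{h}, \mathrm{v}\})$.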

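By definition of $w_{k'}$, we know $w_{k'} \top w_{k'} \equiv_{8k'} w_{k'}$. Therefore, there exist $t'$ and
$t''$ such that $w_{k'} = a_1 t' a_2 a_3 t''$ and

$s' \equiv_{8k'-4} t'$, (9)

$s'' \top w_{k'} \equiv_{8k'-4} t''$. (10)

Let $q_1 = |a_1 t' a_2|$, and define

$Q_1 = P(w_{k'} \emptyset w_{k'})[1, q_1, 1, p]$, (11)

$Q_2 = P(w_{k'} \emptyset w_{k'})[q_1 + 1, p, 1, p]$. (12)

Clearly, $Q = Q_1 \ominus Q_2$. To conclude the proof, we show $P_1 \equiv_k Q_1$ and $P_2 \equiv_k Q_2$.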
Proof of $P_1 \equiv_k Q_1$. First note the following. Since $\equiv_{8k'-4}$ is a congruence
relation, (10) implies $s'' \top w_{k'} \emptyset w_{k'} \equiv_{8k'-4} t'' \emptyset w_{k'}$, which, in turn, by assumption
about $w_{k'}$, implies

$s'' \top w_{k'} \equiv_{8k'-4} t'' \emptyset w_{k'}$, (13)

hence

$s'' \top s''' \equiv_{8k'-5} t'' \emptyset s'''$. (14)

We now proceed by a case distinction on $|s''|$.
First  case,  |s"|  >  First  of  all,  observe 

Pi  =  P{ais'a2a3(s"-hs'"  in)a^)[l,pi,l,p]  ,  (15) 

Qi  =  P(ai^'a2a3(/'W'Un)a4)[l,^i,l,rf  •  (16) 

Since  |s"|  >  hk',  we  can  use  Corollary  7  (part  (a)  or  (b))  in  combination  with 
the  definition  of  Wk>  to  conclude: 

s"-\-s'"  I  TL=7k'  s"-rs"'  I  n  .  (17) 

This,  together  with  (14)  and  Lemma  4,  yields 

J,  n  =7k>  t”^s'”  f.  n  .  (18) 


Using  the  congruence  property  again  and  combining  (9)  and  (18),  we  obtain: 

(ai  U  {h,  v})s'(a2  U  {h})a3(s''+s^"  i  n)(a4  U  {v}) 

=7/c'-3  (<^1  U  {h,  v})U(a2  U  {h})a3(t  0s  |.n)(a4U{v})  , 


which  means,  as  F  >  k, 

{ai s' 0.20.3(8" -rs”'  f.  n)a4)[l,pi,  l,p]  =7k+4  P(ait'a2a3{t"^s'"  |  n)a4)[l,  qi,  l,p]  . 

From  this,  (15),  and  (16),  together  with  Lemma  9,  now  follows  Pi  =k  Qi- 

Second  case,  |s^'|  <  hk’ >  Write  I  for  hk’ ^  Then  |s^|  >  /,  and,  by  construction 
of  Wk>,  we  can  write  s'  as  So0^  for  an  appropriate  sq.  Define  pictures  R  and  R' 
as  follows: 


R  -  P{aiSQ(^^ 02035" -\-Wk>  n))[l,  |aiSo|,  l,p]  , 

R'  =  P(aiso(0^a2a3s''0u;fc'  4-  n))[l,  |aiSo|,  l,p]  • 

We  have  02035" a^-\-Wk'  i  n  ^7k'  02038" oS'^k>  4-  n  by  Corollary  7(a),  hence 

R.  =k  R'  by  Lemma  9 . 

Let  Z  be  the  unique  ((/  +  !)  x  p)-picture  over  {0}.  Then,  by  Lemma  8, 
PBZ  =k  R'BZ.  On  the  other  hand,  PBZ  =  Pi  (recall  that  02  =  0  by  definition 
of  Wk').  So  for  the  rest  it  is  enough  to  show  R'BZ  =k  Qi- 
By  construction  of  R'  and  Z,  we  know 

P'0Z  =  P(ais'a2a3S^'0'LCfe/)[l,pi,  l,p]  .  (19) 

Combining  (9)  and  (13),  we  obtain 

{ais'o.20.3s"~rwk>)[l,pi,l,p]=8k'-4  {oit'a2a3i"^Wk>)[l,qi,l,p]  •  (20) 

Thus,  by  Lemma  9,  R'BZ  =k  Qi- 

Proof of $P_2 \equiv_k Q_2$. Combining (9) and (14), we obtain

$(a_1 \cup \{\mathrm{v}\}) s' a_2 (a_3 \cup \{\mathrm{h}\}) s'' \top s''' (a_4 \cup \{\mathrm{h}, \mathrm{v}\})$

$\equiv_{8k'-5} (a_1 \cup \{\mathrm{v}\}) t' a_2 (a_3 \cup \{\mathrm{h}\}) t'' \emptyset s''' (a_4 \cup \{\mathrm{h}, \mathrm{v}\})$,
which means

$(a_1 s' a_2 a_3 s'' \top w_{k'})[p_1 + 1, p, 1, p] \equiv_{7k+4} (a_1 t' a_2 a_3 t'' \emptyset w_{k'})[q_1 + 1, p, 1, p]$.

Using Lemma 9, we conclude

$P(w_{k'} \top w_{k'})[p_1 + 1, p, 1, p] \equiv_k P(w_{k'} \emptyset w_{k'})[q_1 + 1, p, 1, p]$,

which means $P_2 \equiv_k Q_2$.

7  Concluding  Remarks 

We  have  seen  that  the  class  of  star-free  picture  languages  is  strictly  included  in 
the  class  of  first-order  definable  picture  languages,  which  clarifies  an  important 
aspect  of  the  fine  structure  of  the  class  of  all  recognizable  picture  languages. 

One obvious question is: what happens when the power of star-free picture
expressions is enhanced, for instance, by introducing a concatenation with four
arguments?  The  proof  methods  presented  here  yield  the  following  result:  star-free 
picture  expressions  are  strictly  less  expressive  than  star-free  picture  expressions 
augmented  by  the  four-place  concatenation,  and  these  expressions  are  strictly 
less  expressive  than  first-order  logic. 

The second question that is interesting here is whether there is a constant
k such that each first-order sentence over pictures is equivalent to a first-order
sentence using k variables. This is true for words and k = 3, see [7].
Abstract. We present a simple method to associate rewards with terms
of  the  stochastic  process  algebra  EMPA  in  order  to  make  the  specification 
and  the  computation  of  performance  measures  easier.  The  basic  idea 
behind  this  method  is  to  specify  rewards  within  actions  of  EMPA  terms, 
so  it  substantially  differs  from  methods  based  on  modal  logic.  The  main 
motivations  of  this  method  are  its  ease  of  use  as  well  as  the  possibility 
of  defining  a  notion  of  equivalence  which  relates  terms  having  the  same 
reward,  thus  allowing  for  simplification  without  altering  the  performance 
index.  We  prove  that  such  an  equivalence  is  a  congruence  finer  than  the 
strong  extended  Markovian  bisimulation  equivalence,  and  we  present  its 
axiomatization. 


1  Introduction 

A commonly used method to specify steady-state performance measures for
Markovian models is based on rewards [6]. The basic idea is that a number
describing a reward (or weight) is attached to every state of the Markovian model,
and the performance index is defined as the weighted sum of the steady-state
probabilities of the states of the Markovian model.
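
The weighted sum just described can be written out explicitly. The symbols below are assumptions introduced only for illustration: $S$ for the state space of the Markovian model, $\pi(s)$ for the steady-state probability of state $s$, and $\rho(s)$ for the reward attached to $s$.

```latex
\[
  \mathit{index} \;=\; \sum_{s \in S} \rho(s) \cdot \pi(s)
\]
```

For instance, choosing $\rho(s) = 1$ for the states in which a server is busy and $\rho(s) = 0$ elsewhere yields the fraction of time during which the server is busy.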

So far the specification of performance measures in the field of stochastic
process algebras has received scarce attention. The main negative consequence
is  that  the  whole  Markovian  model  underlying  a  given  term  has  to  be  manually 
scanned  by  the  designer  in  order  to  assign  rewards  to  states. 

Recently,  in  [3]  a  technique  to  formally  specify  rewards  for  the  stochastic 
process  algebra  PEPA  [5]  has  been  proposed.  The  idea  is  to  express  rewards 
by  means  of  the  Hennessy-Milner  logic  [4]:  a  logical  formula  is  specified  together 
with  an  arithmetical  expression,  and  every  state  satisfying  the  formula  is  assigned 
the  reward  specified  by  means  of  the  arithmetical  expression.  We  shall  call  such 
a  method  logic-based. 

The  idea  of  describing  rewards  through  a  modal  logic  seems  to  be  quite 
adequate  because  modal  logic  formulae  make  assertions  about  changing  state, 
hence  they  constitute  an  adequate  link  between  algebraic  terms,  which  describe 
the  behavior  of  concurrent  systems,  and  rewards,  which  are  associated  with 
states. 

In this paper we propose a different way to associate rewards with terms of
stochastic  process  algebras.  The  idea  is  not  to  use  a  separate  formalism  in  order 
to specify rewards: they are directly described within the actions forming the
algebraic  terms.  This  method,  which  we  shall  call  algebra-based,  closely  resembles 
the  manual  method  consisting  of  associating  rewards  while  scanning  the  state 
space  of  the  Markovian  model:  the  difference  is  that  in  the  algebra-based  method 
the  algebraic  term,  which  is  much  more  compact  than  its  underlying  state  space, 
is  scanned  and  the  appropriate  actions  are  assigned  a  reward.  The  algebra-based 
method  could  be  convenient  due  to  its  ease  of  use,  since  the  designer  is  not  forced 
to  know  the  modal  logic  formalism,  its  low  computational  cost,  as  rewards  are 
associated  with  states  during  the  construction  of  the  semantic  models  without 
the  need  to  check  for  a  modal  logic  formula,  and  the  possibility  of  defining  a 
congruence  which  equates  terms  having  the  same  reward,  thereby  allowing  for 
simplification  without  altering  the  performance  measure. 

The  purpose  of  this  paper  is  to  extend  the  theory  developed  for  the  stochastic 
process  algebra  EMPA  [1,  2]  in  order  to  deal  with  rewards  according  to  the 
algebra-based  method.  In  Sect.  2  we  show  that  several  performance  measures 
can  be  derived  using  the  algebra-based  method.  In  Sect.  3  we  introduce  the 
syntax  and  the  semantics  for  EMPA  augmented  with  rewards.  In  Sect.  4  we 
define  an  equivalence  which  relates  two  terms  if  they  have  the  same  reward,  we 
prove  that  such  an  equivalence  is  a  congruence  strictly  contained  in  the  strong 
extended  Markovian  bisimulation  equivalence,  and  we  present  its  axiomatization. 
In  Sect.  5  we  report  some  concluding  remarks. 


2  Deriving  Performance  Measures 

In  this  section  we  show  by  means  of  an  example  that  the  algebra-based  method 
we  are  going  to  introduce,  though  less  powerful  in  general  than  the  logic-based 
method  proposed  in  [3],  allows  the  designer  to  easily  specify  several  steady-state 
performance  measures  frequently  occurring  in  practice  such  as  those  identified 
in  [3]:  rate  type  (e.g.  throughput  of  a  service  center),  counting  type  (e.g.  mean 
number  of  customers  waiting  in  a  service  center),  delay  type  (e.g.  mean  response 
time  experienced  by  customers  in  a  service  center),  and  percentage  type  (e.g. 
the  fraction  of  time  during  which  a  server  is  busy). 

The example we consider is taken from queueing theory, and concerns a
queueing system M/M/n/n with arrival rate $\lambda$ and service rate $\mu$ [7]. Such a
queueing system represents a service center composed of n independent servers,
such that the customer interarrival time is exponentially distributed with rate $\lambda$
and the service time of each server is exponentially distributed with rate $\mu$. The
queueing system at hand can be given two different descriptions with EMPA:
a state-oriented description where the focus is on the state of the set of servers
(intended as the number of servers that are currently busy), and a
resource-oriented description where the servers are modeled separately [9]. Recalling that
“$<a, \tilde{\lambda}>.\_$” is the prefix operator where $a$ is the action type and $\tilde{\lambda}$ is the action
rate (a positive real number in the case of exponentially timed actions, $\infty_{l,w}$
in the case of prioritized weighted immediate actions, and $*$ in the case of
passive actions), “$\_ + \_$” is the alternative composition operator, and “$\_ \,\|_S\, \_$” is the
parallel composition operator with synchronization set $S$, the state-oriented
description is given by

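\[
\begin{array}{rcl}
\mathit{System}^{so}_{M/M/n/n} & \triangleq & \mathit{Arrivals} \,\|_{\{a\}}\, \mathit{Servers}_0 \\
\mathit{Arrivals} & \triangleq & <a, \lambda>.\mathit{Arrivals} \\
\mathit{Servers}_0 & \triangleq & <a, *>.\mathit{Servers}_1 \\
\mathit{Servers}_h & \triangleq & <a, *>.\mathit{Servers}_{h+1} + <s, h \cdot \mu>.\mathit{Servers}_{h-1}, \quad 1 \le h \le n-1 \\
\mathit{Servers}_n & \triangleq & <s, n \cdot \mu>.\mathit{Servers}_{n-1}
\end{array}
\]
whereas the resource-oriented description is given by
\[
\begin{array}{rcl}
\mathit{System}^{ro}_{M/M/n/n} & \triangleq & \mathit{Arrivals} \,\|_{\{a\}}\, \mathit{Servers} \\
\mathit{Arrivals} & \triangleq & <a, \lambda>.\mathit{Arrivals} \\
\mathit{Servers} & \triangleq & \underbrace{S \,\|_{\emptyset}\, S \,\|_{\emptyset} \cdots \|_{\emptyset}\, S}_{n} \\
S & \triangleq & <a, *>.<s, \mu>.S
\end{array}
\]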

In  order  to  highlight  the  difference  between  the  logic-based  method  and  the 
algebra-based  method  for  assigning  rewards  to  stochastic  process  algebra  terms, 
we  compute  for  the  queueing  system  above  the  mean  number  of  customers  in 
the  system.  Since  every  state  must  be  given  a  reward  equal  to  the  number  of 
customers  in  that  state,  we  proceed  as  follows: 

- In the case of the state-oriented description $\mathit{System}^{so}_{M/M/n/n}$, the reward
specification used in the logic-based method is $\langle s \rangle \mathit{tt} \Rightarrow \mathit{rate}(s)/\mu$, i.e. every
state having an outgoing transition with type $s$ is given a reward equal to the
rate of that transition divided by $\mu$. Using the algebra-based method, every
action of the form $<s, h \cdot \mu>$ must be replaced by $<s, h \cdot \mu, h>$ (and any
other action must be replaced by a triple with zero reward). Thus, in such a
case the two methods are equally simple.

- In the case of the resource-oriented description $\mathit{System}^{ro}_{M/M/n/n}$, the
logic-based method turns out to be more complex because the modal logic formula
must somehow count the number of possible consecutive actions with type $s$
that can be executed: as a consequence, the rewards can be specified through
the set composed of
$\langle s \rangle \neg \langle s \rangle \mathit{tt} \Rightarrow 1$, $\langle s \rangle \langle s \rangle \neg \langle s \rangle \mathit{tt} \Rightarrow 2$, ..., $\langle s \rangle \langle s \rangle \cdots \langle s \rangle \neg \langle s \rangle \mathit{tt} \Rightarrow n$.
If we use instead the algebra-based method, all we have to do is to replace
every action of the form $<s, \mu>$ with $<s, \mu, 1>$ as we assume that rewards
are additive (by analogy with rates of exponentially timed actions and weights
of immediate actions), i.e. the reward gained by a state is the sum of the
rewards labeling its outgoing transitions. Therefore, in such a case the ease of
use of the algebra-based method becomes evident, and it would be even more
evident if we considered e.g. a queueing system similar to the previous one
where a FIFO queue with a given capacity is introduced in front of the set
of servers: since the delivery of a customer from the queue to the server has
to be modeled by means of an action, and since actions of type $s$ are
interleaved with actions of this kind, the formalization of modal logic formulae
that capture the number of customers in the system is really difficult.
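
The two reward assignments for the mean number of customers can be checked numerically. The following is a minimal sketch, not part of the original development: it assumes the standard birth-death solution of the M/M/n/n chain, it lumps the states of the resource-oriented model by the number of busy servers, and all function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def steady_state(lam, mu, n):
    """Steady-state probabilities of having h = 0..n busy servers in M/M/n/n."""
    load = Fraction(lam) / Fraction(mu)
    # Birth-death balance: pi[h] * lam = pi[h + 1] * (h + 1) * mu,
    # hence pi[h] is proportional to load**h / h!.
    weights = [load**h / factorial(h) for h in range(n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def reward_state_oriented(h):
    # One outgoing service transition <s, h*mu, h> when h >= 1, none when h == 0.
    return h if h >= 1 else 0


def reward_resource_oriented(h):
    # h outgoing service transitions <s, mu, 1>, one per busy server;
    # by the additivity assumption their rewards are summed.
    return sum(1 for _busy_server in range(h))


def performance_index(pi, state_reward):
    """Weighted sum of the steady-state probabilities."""
    return sum(p * state_reward(h) for h, p in enumerate(pi))


pi = steady_state(lam=2, mu=1, n=2)
mean_so = performance_index(pi, reward_state_oriented)
mean_ro = performance_index(pi, reward_resource_oriented)
print(mean_so, mean_ro)
```

Under these assumptions both descriptions assign reward h to a state with h busy servers, so the two indices coincide.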
To conclude this section, we show that other performance measures for the
queueing  system  above  can  be  easily  specified  with  the  algebra-based  method, 
and  that  this  capability  depends  on  the  style  used  to  represent  the  system: 

- If we want to compute the throughput of the system, defined as the mean
number of customers served per time unit, we have to take into account the
rate of actions having type $s$. As a consequence, in the case of the
state-oriented description $\mathit{System}^{so}_{M/M/n/n}$ we must replace every action of the
form $<s, h \cdot \mu>$ with $<s, h \cdot \mu, h \cdot \mu>$, while in the case of the
resource-oriented description $\mathit{System}^{ro}_{M/M/n/n}$ we must replace every action of the
form $<s, \mu>$ with $<s, \mu, \mu>$.

- If we want to compute the mean response time of the system, defined as
the mean time spent by the customers in the system, we can exploit Little's
law [7] which states that the mean response time of the system is equal to
the mean number of customers in the system divided by the customer arrival
rate. Therefore, in the case of the state-oriented description
$\mathit{System}^{so}_{M/M/n/n}$ we must replace every action of the form $<s, h \cdot \mu>$
with $<s, h \cdot \mu, h/\lambda>$, while in the case of the resource-oriented description
$\mathit{System}^{ro}_{M/M/n/n}$ we must replace every action of the form $<s, \mu>$ with
$<s, \mu, 1/\lambda>$.

- If we want to compute the utilization of the system, defined as the fraction
of time during which servers are busy, we have to single out states having
an outgoing transition labeled with $s$. Thus, in the case of the state-oriented
description $\mathit{System}^{so}_{M/M/n/n}$ we must replace every action of the form
$<s, h \cdot \mu>$ with $<s, h \cdot \mu, 1>$. We observe that, unlike the logic-based
method, in the case of the resource-oriented description $\mathit{System}^{ro}_{M/M/n/n}$
the algebra-based method cannot be used to determine the utilization of the
system due to the additivity assumption: the reward to associate with actions
of the form $<s, \mu>$ would be the reciprocal of the number of transitions
labeled with $s$ exiting from the same state. Since the main objective of the
algebra-based method is its ease of use, we prefer to keep the specification of
rewards as simple as possible, i.e. just by means of numbers: thus we avoid the
introduction of arithmetical expressions as well as particular functions such
as the one determining the number of transitions of a given type exiting from
the same state. Incidentally, the inability to compute the utilization in the
case of the resource-oriented description should not come as a surprise, since
this description is more suited to the determination of performance indices
concerning a single server instead of the whole set of servers. As it turns
out, it is quite easy to measure the utilization of a given server specified in
$\mathit{System}^{ro}_{M/M/n/n}$, whereas this is not possible for $\mathit{System}^{so}_{M/M/n/n}$. This means
that the style [9] used to describe a given system through an algebraic term is
strongly related to the possibility of deriving certain performance measures
through the algebra-based method.
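
The three reward assignments listed above for the state-oriented description can be evaluated in the same way. This is a sketch under stated assumptions: the steady-state distribution is taken from the standard birth-death solution, and the names `steady_state` and `measures` are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction
from math import factorial


def steady_state(lam, mu, n):
    """Steady-state probabilities of having h = 0..n busy servers in M/M/n/n."""
    load = Fraction(lam) / Fraction(mu)
    weights = [load**h / factorial(h) for h in range(n + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def measures(lam, mu, n):
    """Evaluate the three reward structures on the state-oriented model."""
    pi = steady_state(lam, mu, n)
    lam, mu = Fraction(lam), Fraction(mu)
    # Reward of the state with h busy servers, i.e. the reward of its only
    # outgoing s-transition (state 0 has no such transition, hence reward 0).
    rewards = {
        "throughput": lambda h: h * mu,               # <s, h*mu, h*mu>
        "response_time": lambda h: h / lam,           # <s, h*mu, h/lam>
        "utilization": lambda h: 1 if h >= 1 else 0,  # <s, h*mu, 1>
    }
    return {
        name: sum(p * reward(h) for h, p in enumerate(pi))
        for name, reward in rewards.items()
    }


result = measures(lam=2, mu=1, n=2)
print(result)
```

Only the reward component changes from one measure to the next; the underlying Markovian model and its steady-state solution are computed once and reused.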

3 Syntax and Semantics for EMPA$_r$

In this section we extend the syntax and the semantics for EMPA [1, 2] in order
to cope with the presence of rewards treated according to the algebra-based
method outlined in the previous section: the resulting stochastic process algebra
is called EMPA$_r$.
As usual, the building blocks of EMPA$_r$ are actions. Each action is a triple
$<a, \tilde{\lambda}, r>$ consisting of the type of the action, the rate of the action and the
reward of the action: the third component is new with respect to the structure of
EMPA actions. Like in EMPA, actions are divided into external and internal ($\tau$)
according to types, while they are classified as exponentially timed, immediate
or passive according to rates. Since exponentially timed actions model activities
that are relevant from the performance standpoint, nonzero rewards can be
assigned only to them. We denote by $AType$ the set of types, by
$ARate = \mathbb{R}_+ \cup Inf \cup \{*\}$, with $Inf = \{\infty_{l,w} \mid l \in \mathbb{N}_+ \wedge w \in \mathbb{R}_+\}$, the set of rates,
by $AReward = \mathbb{R}$ the set of rewards, and by
$Act_r = \{<a, \tilde{\lambda}, r> \in AType \times ARate \times AReward \mid \tilde{\lambda} \in Inf \cup \{*\} \Rightarrow r = 0\}$
the set of actions. We use $a, b, c, \ldots$ as metavariables for $AType$,
$\tilde{\lambda}, \tilde{\mu}, \tilde{\gamma}, \ldots$ for $ARate$, $\lambda, \mu, \gamma, \ldots$ for $\mathbb{R}_+$, and $r, r', r'', \ldots$ for $AReward$.

Finally, we denote by $PLevel = \{-1\} \cup \mathbb{N}$ the set of priority levels, and we
assume that $* < \lambda < \infty_{l,w}$ for all $\lambda \in \mathbb{R}_+$ and $\infty_{l,w} \in Inf$.

Let $Const$ be a set of constants, ranged over by $A, B, C, \ldots$, and let
$RFun = \{\varphi : AType \rightarrow AType \mid \varphi(\tau) = \tau \wedge \varphi(AType - \{\tau\}) \subseteq AType - \{\tau\}\}$
be a set of relabeling functions.

Definition 1. The set $\mathcal{L}_r$ of process terms of EMPA$_r$ is generated by the
following syntax

$E ::= 0 \mid <a, \tilde{\lambda}, r>.E \mid E/L \mid E[\varphi] \mid E + E \mid E \,\|_S\, E \mid A$

where $L, S \subseteq AType - \{\tau\}$. The set $\mathcal{L}_r$ will be ranged over by $E, F, G, \ldots$. We
denote by $\mathcal{G}_r$ the set of guarded and closed terms of $\mathcal{L}_r$. ■

We  recall  from  [1,  2]  that  the  alternative  composition  operator  is  parametric 
in  the  nature  of  the  choice:  the  choice  is  solved  according  to  durations  in  the 
case  of  exponentially  timed  actions  (race  policy)  and  according  to  priorities  and 
weights  in  the  case  of  immediate  actions  (preselection  policy),  while  it  is  purely 
nondeterministic in the case of passive actions. We also recall that, concerning
the  parallel  composition  operator,  a  synchronization  can  occur  if  and  only  if  the 
involved  actions  have  the  same  type  belonging  to  the  synchronization  set,  and 
at  most  one  of  the  involved  actions  is  not  passive. 

The integrated semantics of EMPA$_r$ terms can be defined by exploiting again
the idea of potential move: the multiset^1 of the potential moves of a given
term is inductively computed, then those potential moves having the highest
priority level are selected and appropriately merged. The formal definition is
based on the transition relation $\longrightarrow$, which is the least subset of
$\mathcal{G}_r \times Act_r \times \mathcal{G}_r$ satisfying the inference rule reported in the first part of
Table 1. This rule
selects the potential moves having the highest priority level, and then merges
together those having the same action type, the same priority level and the same

^1 We use “$\{|$” and “$|\}$” as brackets for multisets, $\oplus$ to denote multiset union,
$\mathcal{M}u_{fin}(S)$ ($\mathcal{P}_{fin}(S)$) to denote the collection of finite multisets (sets) over set $S$,
$M(s)$ to denote the multiplicity of element $s$ in multiset $M$, and $\pi_i(M)$ to denote
the multiset obtained by projecting the tuples in multiset $M$ on their $i$-th component.
Thus, e.g., $(\pi_1(PM_2))(<a, *, 0>)$ in the fifth part of Table 1 denotes the multiplicity
of tuples of $PM_2$ whose first component is $<a, *, 0>$.


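\[
\frac{(<a, \tilde{\lambda}, r>, E') \in Melt_r(Select_r(PM_r(E)))}
     {E \xrightarrow{\;a, \tilde{\lambda}, r\;} E'}
\]

\[
\begin{array}{rcl}
PM_r(0) & = & \emptyset \\
PM_r(<a, \tilde{\lambda}, r>.E) & = & \{|\, (<a, \tilde{\lambda}, r>, E) \,|\} \\
PM_r(E/L) & = & \{|\, (<a, \tilde{\lambda}, r>, E'/L) \mid (<a, \tilde{\lambda}, r>, E') \in PM_r(E) \wedge a \notin L \,|\} \;\oplus \\
 & & \{|\, (<\tau, \tilde{\lambda}, r>, E'/L) \mid (<a, \tilde{\lambda}, r>, E') \in PM_r(E) \wedge a \in L \,|\} \\
PM_r(E[\varphi]) & = & \{|\, (<\varphi(a), \tilde{\lambda}, r>, E'[\varphi]) \mid (<a, \tilde{\lambda}, r>, E') \in PM_r(E) \,|\} \\
PM_r(E_1 + E_2) & = & PM_r(E_1) \oplus PM_r(E_2) \\
PM_r(E_1 \,\|_S\, E_2) & = & \{|\, (<a, \tilde{\lambda}, r>, E_1' \,\|_S\, E_2) \mid a \notin S \wedge (<a, \tilde{\lambda}, r>, E_1') \in PM_r(E_1) \,|\} \;\oplus \\
 & & \{|\, (<a, \tilde{\lambda}, r>, E_1 \,\|_S\, E_2') \mid a \notin S \wedge (<a, \tilde{\lambda}, r>, E_2') \in PM_r(E_2) \,|\} \;\oplus \\
 & & \{|\, (<a, \tilde{\gamma}, r>, E_1' \,\|_S\, E_2') \mid a \in S \;\wedge \\
 & & \quad (<a, \tilde{\lambda}_1, r_1>, E_1') \in PM_r(E_1) \;\wedge \\
 & & \quad (<a, \tilde{\lambda}_2, r_2>, E_2') \in PM_r(E_2) \;\wedge \\
 & & \quad \tilde{\gamma} = Norm_{r,rate}(a, \tilde{\lambda}_1, \tilde{\lambda}_2, PM_r(E_1), PM_r(E_2)) \;\wedge \\
 & & \quad r = Norm_{r,reward}(a, r_1, r_2, PM_r(E_1), PM_r(E_2)) \,|\} \\
PM_r(A) & = & PM_r(E) \quad \mbox{if } A \triangleq E
\end{array}
\]

\[
\begin{array}{rcl}
Select_r(PM) & = & \{|\, (<a, \tilde{\lambda}, r>, E) \in PM \mid PL_r(<a, \tilde{\lambda}, r>) = -1 \;\vee \\
 & & \quad \forall (<b, \tilde{\mu}, r'>, E') \in PM.\; PL_r(<a, \tilde{\lambda}, r>) \ge PL_r(<b, \tilde{\mu}, r'>) \,|\}
\end{array}
\]
\[
PL_r(<a, *, 0>) = -1 \qquad PL_r(<a, \lambda, r>) = 0 \qquad PL_r(<a, \infty_{l,w}, 0>) = l
\]

\[
\begin{array}{rcl}
Melt_r(PM) & = & \{ (<a, \tilde{\lambda}, r>, E) \mid (<a, \tilde{\mu}, r'>, E) \in PM \;\wedge \\
 & & \quad \tilde{\lambda} = Min \{|\, \tilde{\gamma} \mid (<a, \tilde{\gamma}, r''>, E) \in PM \wedge PL_r(<a, \tilde{\gamma}, r''>) = PL_r(<a, \tilde{\mu}, r'>) \,|\} \;\wedge \\
 & & \quad r = \sum \{|\, r'' \mid (<a, \tilde{\gamma}, r''>, E) \in PM \wedge PL_r(<a, \tilde{\gamma}, r''>) = PL_r(<a, \tilde{\mu}, r'>) \,|\} \}
\end{array}
\]
\[
*\; Min\; * = * \qquad \lambda_1\; Min\; \lambda_2 = \lambda_1 + \lambda_2 \qquad \infty_{l,w_1}\; Min\; \infty_{l,w_2} = \infty_{l,w_1+w_2}
\]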
derivative term. The first operation is carried out through functions
$Select_r : \mathcal{M}u_{fin}(Act_r \times \mathcal{G}_r) \rightarrow \mathcal{M}u_{fin}(Act_r \times \mathcal{G}_r)$ and $PL_r : Act_r \rightarrow PLevel$, which are
defined in the third part of Table 1. The second operation is carried out through
function $Melt_r : \mathcal{M}u_{fin}(Act_r \times \mathcal{G}_r) \rightarrow \mathcal{P}_{fin}(Act_r \times \mathcal{G}_r)$ and partial function
$Min : (ARate \times ARate) \rightharpoonup ARate$, which are defined in the fourth part of
Table 1. Observe that function $Melt_r$ sums the rewards of the potential moves
to merge: this is consistent with the additivity assumption about rewards.
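
The selection and merging steps can be sketched operationally. The encoding below is an assumption made purely for illustration: a potential move is a tuple (type, rate, reward, derivative), rates are tagged tuples, derivative terms are plain strings, and the names `select` and `melt` are hypothetical stand-ins for $Select_r$ and $Melt_r$.

```python
from collections import defaultdict

PASSIVE = ("passive",)


def exp(rate):
    """Exponentially timed rate (a positive number)."""
    return ("exp", rate)


def imm(level, weight):
    """Immediate rate with priority level and weight."""
    return ("imm", level, weight)


def priority_level(rate):
    # Passive: -1, exponentially timed: 0, immediate: its priority level.
    if rate[0] == "passive":
        return -1
    if rate[0] == "exp":
        return 0
    return rate[1]


def select(moves):
    """Keep passive moves and the moves with the highest priority level."""
    top = max((priority_level(m[1]) for m in moves), default=-1)
    return [m for m in moves if priority_level(m[1]) in (-1, top)]


def melt(moves):
    """Merge moves with the same type, priority level and derivative term."""
    groups = defaultdict(list)
    for a, rate, reward, target in moves:
        groups[(a, priority_level(rate), target)].append((rate, reward))
    merged = set()
    for (a, level, target), items in groups.items():
        total_reward = sum(reward for _, reward in items)  # rewards are additive
        if level == -1:
            rate = PASSIVE                                 # * Min * = *
        elif level == 0:
            rate = exp(sum(r[1] for r, _ in items))        # rates are summed
        else:
            rate = imm(level, sum(r[2] for r, _ in items))  # weights are summed
        merged.add((a, rate, total_reward, target))
    return merged


moves = [("a", exp(2), 1, "E"), ("a", exp(3), 2, "E"), ("a", PASSIVE, 0, "E")]
timed = melt(select(moves))
with_immediate = melt(select(moves + [("b", imm(1, 4), 0, "F")]))
print(timed)
print(with_immediate)
```

In the first call the two exponentially timed moves collapse into a single move with rate 5 and reward 3, while in the second call the immediate move preempts them and only the passive move survives alongside it.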

The multiset $PM_r(E) \in \mathcal{M}u_{fin}(Act_r \times \mathcal{G}_r)$ of potential moves of $E \in \mathcal{G}_r$ is
defined by structural induction in the second part of Table 1. The normalization
of rates and rewards of potential moves resulting from the synchronization of
an action with several independent or alternative passive actions is carried out
through partial functions
$Norm_{r,rate} : (AType \times ARate \times ARate \times \mathcal{M}u_{fin}(Act_r \times \mathcal{G}_r) \times \mathcal{M}u_{fin}(Act_r \times \mathcal{G}_r)) \rightharpoonup ARate$
and
$Norm_{r,reward} : (AType \times AReward \times AReward \times \mathcal{M}u_{fin}(Act_r \times \mathcal{G}_r) \times \mathcal{M}u_{fin}(Act_r \times \mathcal{G}_r)) \rightharpoonup AReward$,
and function $Split : (ARate \times \mathbb{R}_{]0,1]}) \rightarrow ARate$, which are defined in the fifth part of
Table 1. Observe that the normalization of rewards is consistent with the additivity
assumption about rewards.

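Definition 2. The integrated interleaving semantics of $E \in \mathcal{G}_r$ is the labeled
transition system $\mathcal{I}_r[\![E]\!] = (\uparrow\! E, Act_r, \longrightarrow_E, E)$ where $\uparrow\! E$ is the set of states
reachable from $E$, and $\longrightarrow_E$ is $\longrightarrow$ restricted to $\uparrow\! E \times Act_r \times \uparrow\! E$. ■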

As  in  [1,  2],  from  the  integrated  semantic  model  it  is  possible  to  obtain  a 
functional  semantic  model  (by  dropping  action  rates  and  rewards)  as  well  as  a 
performance  semantic  model  (basically  by  dropping  action  types  and  by  lifting 
rewards  from  transitions  to  states  according  to  the  additivity  assumption).  Due 
to  lack  of  space,  we  do  not  show  the  related  definitions  here. 

4 A Notion of Equivalence for EMPA$_r$

In [1, 2] we developed a notion of equivalence for EMPA called strong extended
Markovian bisimulation equivalence and denoted $\sim_{EMB}$. Such an equivalence
was defined according to the idea of probabilistic bisimulation [8] on the
integrated semantic model, and we proved that it is necessary to define it on the
integrated semantic model in order for the congruence property to hold. For the
sake of convenience, we can extend $\sim_{EMB}$ to EMPA$_r$ since it disregards rewards,
provided that like in [1, 2] we introduce a priority operator “$\Theta(\_)$” and we
consider the language $\mathcal{L}_{r,\Theta}$ generated by the following syntax

$E ::= 0 \mid <a, \tilde{\lambda}, r>.E \mid E/L \mid E[\varphi] \mid \Theta(E) \mid E + E \mid E \,\|_S\, E \mid A$

whose semantic rules are those in Table 1 except that the rule in the first part
is replaced by

\[
\frac{(<a, \tilde{\lambda}, r>, E') \in Melt_r(PM_r(E))}
     {E \xrightarrow{\;a, \tilde{\lambda}, r\;} E'}
\]

and the following rule for the priority operator is introduced in the second part

\[
PM_r(\Theta(E)) = Select_r(PM_r(E))
\]

It is easily seen that EMPA$_r$ coincides with the set of terms $\{\Theta(E) \mid E \in \mathcal{L}_r\}$.
We denote by $\mathcal{G}_{r,\Theta}$ the set of guarded and closed terms of $\mathcal{L}_{r,\Theta}$.

One of the advantages of the algebra-based method, besides its ease of use, is
the possibility of defining a notion of equivalence for EMPA$_r$ which relates terms
having the same reward, thus allowing for simplification without altering the
value of the performance index we are interested in. Exploiting the lesson learnt
with $\sim_{EMB}$, we define this new equivalence on the integrated semantic model. For
simplicity, one may be tempted to relate strongly extended-Markovian bisimilar
terms having the same total reward, intended as the sum of the rewards attached
to the actions they can execute. However, in this way one would fail both to capture
an equivalence preserving the performance measure at hand and to obtain a
congruence.

Example 1. Consider terms

$A \triangleq <a, \lambda, r>.<b, \mu, r_1>.A$
$B \triangleq <a, \lambda, r>.<b, \mu, r_2>.B$

where $r_1 \neq r_2$. Then $A \sim_{EMB} B$ and $A$ and $B$ have the same total reward $r$, but
if we solve the two underlying performance models we obtain two different values
of the performance measure we are interested in: $r \cdot \mu/(\lambda + \mu) + r_1 \cdot \lambda/(\lambda + \mu)$
and $r \cdot \mu/(\lambda + \mu) + r_2 \cdot \lambda/(\lambda + \mu)$. ■
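
The two values in Example 1 can be reproduced with a few lines of arithmetic. This is a sketch assuming the usual steady-state solution of a two-state cyclic Markov chain (probability $\mu/(\lambda+\mu)$ for the state enabling a and $\lambda/(\lambda+\mu)$ for the state enabling b); the function name and the numeric values are hypothetical.

```python
from fractions import Fraction


def two_state_index(lam, mu, reward_first, reward_second):
    """Performance index of the cycle <a, lam, r>.<b, mu, r'>.X."""
    lam, mu = Fraction(lam), Fraction(mu)
    # The state enabling a is left at rate lam and entered at rate mu,
    # so it is occupied with probability mu / (lam + mu).
    p_first = mu / (lam + mu)
    p_second = lam / (lam + mu)
    return reward_first * p_first + reward_second * p_second


# Same reward r = 3 on the first action, different rewards on the second one.
value_a = two_state_index(lam=1, mu=2, reward_first=3, reward_second=0)
value_b = two_state_index(lam=1, mu=2, reward_first=3, reward_second=6)
print(value_a, value_b)
```

With these illustrative numbers the two terms share the total reward of their initial state, yet the resulting indices differ.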

Example 2. Consider terms

$E_1 \equiv <a, \lambda, r_1>.0 + <b, \mu, r_2>.0$
$E_2 \equiv <a, \lambda, r_2>.0 + <b, \mu, r_1>.0$

where $r_1 \neq r_2$. Then $E_1 \sim_{EMB} E_2$ and $E_1$ and $E_2$ have the same total reward
$r_1 + r_2$, but e.g. $E_1 \,\|_{\{b\}}\, 0$ has total reward $r_1$ while $E_2 \,\|_{\{b\}}\, 0$ has total reward
$r_2$. ■

The  examples  above  show  that  if  we  want  to  preserve  the  performance  measure 
and  to  obtain  a  congruence,  we  cannot  treat  rewards  separately  from  the  rest 
of  the  actions:  rewards  must  be  checked  in  the  bisimilarity  clause  in  order  to 
guarantee  that,  given  two  equivalent  terms,  they  have  the  same  total  reward  and 
any  pair  of  equivalent  terms  reachable  from  them  have  the  same  total  reward. 

Below we show that it is really easy to extend the definition of $\sim_{EMB}$ in such
a  way  that  both  objectives  are  achieved.  Proofs  of  results  are  omitted  whenever 
they  are  smooth  adaptations  of  the  corresponding  proofs  in  [2]. 

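Definition 3. We define partial functions $Rate$, $Reward$, $RR$ with domain
$\mathcal{G}_{r,\Theta} \times AType \times PLevel \times \mathcal{P}(\mathcal{G}_{r,\Theta})$ and ranges $ARate$, $AReward$, $ARate \times AReward$
respectively, by

\[
\begin{array}{rcl}
Rate(E, a, l, C) & = & Min \{|\, \tilde{\lambda} \mid E \xrightarrow{\;a, \tilde{\lambda}, r\;} E' \wedge PL_r(<a, \tilde{\lambda}, r>) = l \wedge E' \in C \,|\} \\
Reward(E, a, l, C) & = & \sum \{|\, r \mid E \xrightarrow{\;a, \tilde{\lambda}, r\;} E' \wedge PL_r(<a, \tilde{\lambda}, r>) = l \wedge E' \in C \,|\} \\
RR(E, a, l, C) & = & (Rate(E, a, l, C), Reward(E, a, l, C))
\end{array}
\]
■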
Definition 4. An equivalence relation $\mathcal{B} \subseteq \mathcal{G}_{r,\Theta} \times \mathcal{G}_{r,\Theta}$ is a strong extended
Markovian reward bisimulation (strong EMRB) iff, whenever $(E_1, E_2) \in \mathcal{B}$, then
for all $a \in AType$, $l \in PLevel$ and $C \in \mathcal{G}_{r,\Theta}/\mathcal{B}$

\[
RR(E_1, a, l, C) = RR(E_2, a, l, C)
\]

In this case we say that $E_1$ and $E_2$ are strongly extended-Markovian reward
bisimilar (strongly EMRB). ■

Proposition 5. Let $\sim_{EMRB}$ be the union of all the strong EMRBs. Then $\sim_{EMRB}$
is the largest strong EMRB. ■

Definition 6. We call $\sim_{EMRB}$ the strong extended Markovian reward bisimulation
equivalence (strong EMRBE), and we say that $E_1, E_2 \in \mathcal{G}_{r,\Theta}$ are strongly
extended-Markovian reward bisimulation equivalent (strongly EMRBE) if and
only if $E_1 \sim_{EMRB} E_2$. ■

Proposition 7. $\sim_{EMRB} \;\subseteq\; \sim_{EMB}$.

Proof. It follows immediately from the fact that every strong EMRB is a strong
EMB too. ■

The following example shows that the inclusion is strict. We would like to point
out that this is not inconsistent with $\sim_{EMB}$. The purpose of $\sim_{EMB}$ is to relate
terms describing concurrent systems having the same functional and performance
properties: if $E_1 \sim_{EMB} E_2$ but $E_1 \not\sim_{EMRB} E_2$, this simply means that we are
measuring two different performance indices for $E_1$ and $E_2$.

Example 3. Consider terms

$A \triangleq <a, \lambda, 1>.<b, \mu, 0>.A$
$B \triangleq <a, \lambda, 0>.<b, \mu, 1>.B$

Then $A \sim_{EMB} B$ but $A \not\sim_{EMRB} B$. If we regard $a$ and $b$ as the transmission
over two different channels, then by means of $A$ we can compute the utilization
of the former channel, whereas by means of $B$ we can compute the utilization of
the latter channel. ■
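
The failure of reward bisimilarity in Example 3 can be observed by evaluating the pair of aggregated rate and aggregated reward on the initial states. The sketch below assumes a flat list of exponentially timed transitions (so the priority level argument is dropped), and the state names, the helper `rr` and the numeric rates are hypothetical.

```python
def rr(transitions, state, action_type, target_class):
    """Aggregated (rate, reward) of the moves from state into target_class."""
    matching = [
        (rate, reward)
        for (source, a, rate, reward, target) in transitions
        if source == state and a == action_type and target in target_class
    ]
    if not matching:
        return None  # the functions are partial: undefined without such moves
    # Exponential rates are summed (Min) and rewards are summed (additivity).
    return (sum(rate for rate, _ in matching),
            sum(reward for _, reward in matching))


lam, mu = 2, 3
transitions = [
    ("A", "a", lam, 1, "A'"), ("A'", "b", mu, 0, "A"),
    ("B", "a", lam, 0, "B'"), ("B'", "b", mu, 1, "B"),
]

# Candidate class containing the derivatives reached after action a.
candidate_class = {"A'", "B'"}
rr_a = rr(transitions, "A", "a", candidate_class)
rr_b = rr(transitions, "B", "a", candidate_class)
print(rr_a, rr_b)
```

The rate components agree, which is what the reward-insensitive equivalence compares, while the reward components differ, so no relation containing the pair of initial states can satisfy the clause of Definition 4.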

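Theorem 8. Let $E_1, E_2 \in \mathcal{G}_{r,\Theta}$. If $E_1 \sim_{EMRB} E_2$ then:

1. For every $<a, \tilde{\lambda}, r> \in Act_r$, $<a, \tilde{\lambda}, r>.E_1 \sim_{EMRB} <a, \tilde{\lambda}, r>.E_2$.

2. For every $L \subseteq AType - \{\tau\}$, $E_1/L \sim_{EMRB} E_2/L$.

3. For every $\varphi \in RFun$, $E_1[\varphi] \sim_{EMRB} E_2[\varphi]$.

4. $\Theta(E_1) \sim_{EMRB} \Theta(E_2)$.

5. For every $F \in \mathcal{G}_{r,\Theta}$, $E_1 + F \sim_{EMRB} E_2 + F$ and $F + E_1 \sim_{EMRB} F + E_2$.

6. For every $F \in \mathcal{G}_{r,\Theta}$ and $S \subseteq AType - \{\tau\}$, $E_1 \,\|_S\, F \sim_{EMRB} E_2 \,\|_S\, F$ and
$F \,\|_S\, E_1 \sim_{EMRB} F \,\|_S\, E_2$. ■

Theorem 9. $\sim_{EMRB}$ is preserved by recursive definitions. ■

Theorem 10. Let $\mathcal{A}_r$ be the set of axioms in Table 2. The deductive system
$Ded(\mathcal{A}_r)$ is sound and complete with respect to $\sim_{EMRB}$ for the set of nonrecursive
terms of $\mathcal{G}_{r,\Theta}$. ■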


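\[
\begin{array}{ll}
(\mathcal{A}_{r,1}) & (E_1 + E_2) + E_3 = E_1 + (E_2 + E_3) \\
(\mathcal{A}_{r,2}) & E_1 + E_2 = E_2 + E_1 \\
(\mathcal{A}_{r,3}) & E + 0 = E \\
(\mathcal{A}_{r,4}) & <a, \tilde{\lambda}_1, r_1>.E + <a, \tilde{\lambda}_2, r_2>.E = <a, \tilde{\lambda}_1\; Min\; \tilde{\lambda}_2, r_1 + r_2>.E \\
 & \quad \mbox{if } PL_r(<a, \tilde{\lambda}_1, r_1>) = PL_r(<a, \tilde{\lambda}_2, r_2>) \\[1ex]
(\mathcal{A}_{r,5}) & 0/L = 0 \\
(\mathcal{A}_{r,6}) & (<a, \tilde{\lambda}, r>.E)/L = \left\{ \begin{array}{ll}
      <a, \tilde{\lambda}, r>.(E/L) & \mbox{if } a \notin L \\
      <\tau, \tilde{\lambda}, r>.(E/L) & \mbox{if } a \in L
    \end{array} \right. \\
(\mathcal{A}_{r,7}) & (E_1 + E_2)/L = E_1/L + E_2/L \\[1ex]
(\mathcal{A}_{r,8}) & 0[\varphi] = 0 \\
(\mathcal{A}_{r,9}) & (<a, \tilde{\lambda}, r>.E)[\varphi] = <\varphi(a), \tilde{\lambda}, r>.(E[\varphi]) \\
(\mathcal{A}_{r,10}) & (E_1 + E_2)[\varphi] = E_1[\varphi] + E_2[\varphi] \\[1ex]
(\mathcal{A}_{r,11}) & \Theta(0) = 0 \\
(\mathcal{A}_{r,12}) & \Theta(\sum_{i \in I} <a_i, \tilde{\lambda}_i, r_i>.E_i) = \sum_{j \in J} <a_j, \tilde{\lambda}_j, r_j>.\Theta(E_j) \\
 & \quad \mbox{where } J = \{ i \in I \mid \tilde{\lambda}_i = * \vee \forall h \in I.\; PL_r(<a_i, \tilde{\lambda}_i, r_i>) \ge PL_r(<a_h, \tilde{\lambda}_h, r_h>) \} \\[1ex]
(\mathcal{A}_{r,13}) & 0 \,\|_S\, 0 = 0 \\
(\mathcal{A}_{r,14}) & (\sum_{i \in I} <a_i, \tilde{\lambda}_i, r_i>.E_i) \,\|_S\, 0 = \sum_{j \in J} <a_j, \tilde{\lambda}_j, r_j>.(E_j \,\|_S\, 0)
   \quad \mbox{where } J = \{ i \in I \mid a_i \notin S \} \\
(\mathcal{A}_{r,15}) & 0 \,\|_S\, (\sum_{i \in I} <a_i, \tilde{\lambda}_i, r_i>.E_i) = \sum_{j \in J} <a_j, \tilde{\lambda}_j, r_j>.(0 \,\|_S\, E_j)
   \quad \mbox{where } J = \{ i \in I \mid a_i \notin S \} \\
(\mathcal{A}_{r,16}) & (\sum_{i \in I_1} <a_i, \tilde{\lambda}_i, r_i>.E_i) \,\|_S\, (\sum_{i \in I_2} <a_i, \tilde{\lambda}_i, r_i>.E_i) = \\
 & \quad \sum_{j \in J_1} <a_j, \tilde{\lambda}_j, r_j>.(E_j \,\|_S\, \sum_{i \in I_2} <a_i, \tilde{\lambda}_i, r_i>.E_i) \; + \\
 & \quad \sum_{j \in J_2} <a_j, \tilde{\lambda}_j, r_j>.(\sum_{i \in I_1} <a_i, \tilde{\lambda}_i, r_i>.E_i \,\|_S\, E_j) \; + \\
 & \quad \sum_{k \in K_1} \sum_{h \in H_k} <a_k, Split(\tilde{\lambda}_k, 1/n_k), r_k/n_k>.(E_k \,\|_S\, E_h) \; + \\
 & \quad \sum_{k \in K_2} \sum_{h \in H_k} <a_k, Split(\tilde{\lambda}_k, 1/n_k), r_k/n_k>.(E_h \,\|_S\, E_k) \\
 & \quad \mbox{where } J_1 = \{ i \in I_1 \mid a_i \notin S \} \\
 & \quad \phantom{\mbox{where }} J_2 = \{ i \in I_2 \mid a_i \notin S \} \\
 & \quad \phantom{\mbox{where }} K_1 = \{ i_1 \in I_1 \mid \exists i_2 \in I_2.\; a_{i_1} = a_{i_2} \in S \wedge \tilde{\lambda}_{i_2} = * \} \\
 & \quad \phantom{\mbox{where }} K_2 = \{ i_2 \in I_2 \mid \exists i_1 \in I_1.\; a_{i_1} = a_{i_2} \in S \wedge \tilde{\lambda}_{i_1} = * \} \\
 & \quad \phantom{\mbox{where }} H_k = \{ h \in I_2 \mid a_k = a_h \wedge \tilde{\lambda}_h = * \} \mbox{ with } k \in K_1 \\
 & \quad \phantom{\mbox{where }} H_k = \{ h \in I_1 \mid a_k = a_h \wedge \tilde{\lambda}_h = * \} \mbox{ with } k \in K_2 \\
 & \quad \phantom{\mbox{where }} n_k = |H_k|
\end{array}
\]

Table 2. Axioms for $\sim_{EMRB}$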
5 Conclusion

In this paper we have introduced an algebra-based method to attach rewards
to EMPA terms in order to derive performance measures. As observed in
Sect.  2,  though  less  powerful  in  general  than  the  logic-based  method  proposed 
in  [3],  the  algebra-based  method  may  be  convenient  due  to  its  ease  of  use,  its 
low  computational  cost  and  the  possibility  of  defining  a  notion  of  equivalence 
accounting  for  rewards.  Furthermore,  it  has  been  a  really  easy  task  to  extend 
the  theory  developed  for  EMPA  in  order  to  take  into  account  rewards  according 
to  the  algebra-based  method. 

Concerning  future  work,  we  could  allow  the  designer  to  associate  rewards 
with  immediate  actions  as  well,  because  in  this  way  we  could  derive  performance 
measures  also  when  we  restrict  ourselves  to  the  probabilistic  kernel  [2]  of  EMPA. 
Finally, the algebra-based method will be implemented in a software tool that we
are currently developing, based on EMPA, for the modeling and analysis of functional
and performance properties of concurrent systems.
Abstract

In  this  paper  we  present  two  actor  languages  and  a  semantics  preserving  translation  between  them.  The  source  of 
the  translation  is  a  high-level  language  that  provides  object-based  programming  abstractions.  The  target  is  a  simple 
functional  language  extended  with  basic  primitives  for  actor  computation.  The  semantics  preserved  is  the  interaction 
semantics  of  actor  systems — sets  of  possible  interactions  of  a  system  with  its  environment.  The  proof  itself  is  of 
interest  since  it  demonstrates  a  methodology  based  on  the  actor  theory  framework  for  reasoning  about  correctness  of 
transformations  and  translations  of  actor  programs  and  languages  and  more  generally  of  concurrent  object  languages. 


1  Introduction 

In this paper we continue our investigation of the actor model of computation [Hew77, Agh86, Agh90, AMST97,
Tal96b, Tal96a]. Actors are independent computational agents that interact solely via asynchronous message passing.
An actor can create other actors, send messages, and modify its own local state. An actor can only affect the local state
of other actors by sending them messages, and it can only send messages to its acquaintances: addresses of actors it
was given upon creation, it received in a message, or that it created. Actor semantics requires computations to be fair.

We  take  two  views  of  actors:  as  individuals  and  as  elements  of  components.  Individual  actors  provide  units  of 
encapsulation and integrity. Components are collections of actors (and messages) provided with an interface
specifying the receptionists (actors accessible from outside the component) and external actors (accessible from but not
existing inside the component). Collecting actors into components provides for composability and coordination.
Individual actors are described in terms of local transitions. Components are described in terms of interactions with their
environment. 

The  actor  model  provides  a  natural  framework  for  inter-operation  of  multiple  languages  since  the  details  of  the 
code describing an individual actor's behavior are not visible outside that actor. All that needs to be common is the
messages  communicated  among  the  different  actors.  In  [Tal96b],  this  intuition  is  formalized  using  the  notion  of  an 
abstract  actor  structure.  Here  we  generalize  the  notion  of  an  abstract  actor  structure  to  an  actor  theory.  Actor  theories 
provide  a  general  semantic  framework  for  specifying  and  reasoning  about  actor  systems  as  well  as  for  reasoning  about 
relations  between  different  actor  languages.  An  actor  theory  plays  the  role  of  a  theory  that  axiomatizes  the  behavior 
of  individual  actors.  The  models  of  an  actor  theory  account  directly  for  the  interaction  (exchange  of  messages)  of 
an actor component with its environment. Each model of an actor theory gives rise to a corresponding semantics of
actor  components.  Two  important  models  are:  computation  paths — analogous  to  labelled  transition  system  semantics; 
and  interaction  paths — obtained  from  computation  paths  by  omitting  details  of  internal  computation.  These  give  rise 
to  computation  path  and  interaction  semantics,  respectively.  Both  semantics  are  composable  and  as  we  will  see, 
interaction  semantics  is  largely  insensitive  to  the  particular  choice  of  actor  language. 

In  this  paper  we  illustrate  the  ideas  and  techniques  based  on  actor  theories  by  showing  how  they  can  be  used 
to establish the correctness of a translation from a high-level actor language to a low-level actor language such as
might be found in a compiler preprocessor. The low-level kernel language is an extension of a simple functional
language based on the call-by-value $\lambda$-calculus with primitives for actor computation. The high-level user language
provides object-based programming abstractions. Each of the languages is given a semantics by defining a
corresponding actor theory. We give a separate semantics for the user language in order to be able to reason directly
about user programs. The correctness theorem shows that we can also reason about user programs by translating to the
kernel language and reasoning in terms of the kernel semantics. The translation, u2k, from the user language to the
kernel language eliminates the object-based programming abstractions in favor of the simple actor primitives. The
main result presented here is that the translation, u2k, preserves the interaction semantics.

Theorem (user-to-kernel): $Isem({}^{u}P) = Isem(u2k({}^{u}P)) \lceil {}^{u}\mathbf{M}$, where ${}^{u}P$ is a user language program, $Isem$ maps
programs to their interaction semantics, and $\lceil {}^{u}\mathbf{M}$ restricts the kernel interactions to user language messages.

The proof that the translation preserves interaction semantics itself is of interest since it demonstrates a methodology
for proving correctness of transformations and translations of actor languages and more generally of concurrent
object  languages.  For  the  proof  we  lift  the  translation  to  semantic  configurations  that  correspond  to  the  possible  actor 
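system states and show that the following diagram commutes

\[
\begin{array}{ccc}
{}^{u}P & \xrightarrow{\;u2k\;} & {}^{k}P \\
\downarrow {\scriptstyle [\![\_]\!]} & & \downarrow {\scriptstyle [\![\_]\!]} \\
{}^{u}K & \xrightarrow{\;u2k\;} & {}^{k}K
\end{array}
\]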

where $P$ is a top-level program, $K$ is a configuration, and $[\![\_]\!]$ gives the semantics of a program in terms of the initial
configuration that it describes. (We use the following convention: if $X$ is some entity, then we use the super-prescript
${}^{u}X$ to indicate that $X$ belongs to the user language and ${}^{k}X$ to indicate that $X$ belongs to the kernel language. So
for example ${}^{u}K$ is a user language configuration.) The proof is completed by showing that interaction semantics is
preserved by translation at the semantic level: $Isem({}^{u}K) = Isem(u2k({}^{u}K)) \lceil {}^{u}\mathbf{M}$. This proof involves establishing a
correspondence  between  the  (possibly  infinite)  computations  of  two  systems.  The  actor  theories  defined  for  each  of  the 
languages  correspond  to  standard  transition  system  semantics  with  transitions  that  are  small  and  easy  to  understand,  but 
expose  much  irrelevant  detail.  We  make  use  of  a  general  interaction  semantics  preserving  actor  theory  transformation 
that can be thought of as moving from a small step to a big step operational semantics. Changing the level of abstraction
of  the  operational  semantics  of  a  fixed  language  is  a  general  technique  useful  for  reasoning  about  systems  at  the 
desired  level  of  detail.  Reasoning  about  the  level  changing  transformation  on  actor  theories  and  the  language  changing 
translation is simplified by using ideas from the rewriting logic model of concurrent computation [Mes92, Tal96a] to
define  notions  of  computation  path  equivalence. 

Notation: We use the usual notation for sets, functions, finite sequences, etc. Let $Y$ be a set. We specify meta-variable
conventions in the form: let $y$ range over $Y$, which should be read as: the meta-variable $y$ and decorated
variants such as $y'$, $y_0$, $\ldots$, range over the set $Y$. $\mathbf{M}_\omega[Y]$ is the set of (finite) multisets with elements in $Y$, $\emptyset$ is the
empty multiset, and if $X_0$ and $X_1$ are multisets, then $X_0, X_1$ is the multiset union of the two.


2  A  Semantic  Framework  for  Actors 


In  this  section  we  introduce  actor  theories  as  a  general  semantic  framework  for  actor  computation.  The  notion  of 
actor theory provides an axiomatic characterization of actor languages: the basic features, capabilities, and
constraints. Actor theories can be considered as an operational alternative to the domain theoretic behaviors used by
Clinger [Cli81]. Actor theories are a simplification and generalization of the notion of abstract actor structures
presented in [Tal96b, Tal96a].

An  actor  theory  describes  individual  actor  behaviors  and  their  local  interactions  in  a  representation  independent 
manner.  An  actor  theory  specifies  sets  of  actor  names,  actor  states,  message  contents,  and  labelled  reaction  rules. 
Actor  names  are  the  means  of  uniquely  identifying  individual  actors.  Actor  states  are  intended  to  carry  information 
traditionally  contained  in  the  script  (methods)  and  acquaintances  (values  of  instance  variables),  as  well  as  the  local 
message  queue  and  the  current  processing  state.  Message  contents  represent  the  information  that  can  be  communicated 
between  actors,  both  locally  and  as  interactions  with  the  environment.  Reaction  rules  determine  what  an  actor  in  a 
given  state  can  do  next  and  how  it  will  respond  to  messages  with  given  contents.  More  generally  reaction  rules 
describe synchronous interactions of groups of actors and messages. Reaction rules are labelled. These labels are used
in  deriving  a  labelled  transition  system  semantics.  In  this  way  the  labels  provide  information  concerning  the  basic 
observations  that  can  be  made  as  an  actor  system  evolves.  An  actor  theory  must  obey  the  fundamental  acquaintance 
(locality) laws of actors [BH77, Cli81] in addition to renaming laws that express the fact that computation is uniformly
parameterized  in  the  choice  of  actor  names — renaming  commutes  with  everything.  To  state  these  laws  an  actor  theory 
also  provides  a  primitive  operation  to  determine  the  acquaintances  of  (actor  names  occurring  in)  the  various  entities 
and a primitive operation to rename them.

The  operational  semantics  of  an  actor  theory  is  given  by  the  transition  relation  on  configurations  derived  from  the 
reaction  rules.  A  configuration  can  be  thought  of  as  representing  a  global  snapshot  of  an  actor  system  with  respect 
to  some  idealized  observer  [Agh86].  It  contains  a  set  of  receptionist  names,  a  set  of  external  actor  names,  and  a 
collection  of  actors  and  messages.  The  sets  of  receptionist  names  and  external  actor  names  are  the  interface  of  an 
actor  configuration  to  its  environment.  They  specify  what  internal  actors  are  visible  from  the  environment,  and  what 
actor  connections  must  be  provided  for  the  configuration  to  function.  Both  the  set  of  receptionist  names  and  the  set  of 
external  actor  names  may  grow  as  the  configuration  evolves.  The  collection  of  actors  and  messages  is  the  interior  of 
the  configuration.  It  specifies  the  internal  actors  and  their  current  states,  and  the  state  of  the  internal  message  system. 
Configurations  evolve  either  by  internal  computation  or  by  interaction  with  the  environment.  The  transition  relation 
expresses  the  ways  a  configuration  might  evolve  and  interact  with  its  environment.  The  computation  path  semantics 
of a configuration is the set of fair computations possible starting with that configuration. Interaction semantics is
a more abstract view of an actor system, specifying only the possible interactions (patterns of message passing) a
system can have with its environment. Interaction semantics is the result of hiding all information concerning the
internal computations and what actors may be present beyond the receptionists.

The term reaction rule is used here in the same spirit as in the Chemical Abstract Machine [BB92] to indicate local
interactions of reactive entities. Actors and messages can be thought of as special kinds of molecules and interiors are
like solutions. Actor theories are in fact a special case of rewrite theories, and the mechanisms we use to derive the
computations of an actor system are based on those of rewriting logic [Mes92].

An actor theory is a structure $AT$ of the following form: $AT = \langle \langle \mathbf{A}, \mathbf{S}, \mathbf{M}, \mathbf{L} \rangle, \langle acq, \hat{\cdot} \rangle, RR \rangle$. $\mathbf{A}$, $\mathbf{S}$, $\mathbf{M}$, $\mathbf{L}$
are the primitive sorts of $AT$. $\mathbf{A}$ is a countable set of actor names, $\mathbf{S}$ is a set of actor states, $\mathbf{M}$ is a set of message
contents, and $\mathbf{L}$ is a set of labels. From the primitive sorts we form actor entities (briefly actors), $\mathbf{AE}$, messages,
$\mathbf{Msg}$, and configuration interiors, $\mathbf{I}$. We let $a$ range over $\mathbf{A}$, $M$ range over $\mathbf{M}$, $s$ range over $\mathbf{S}$, $l$ range over $\mathbf{L}$, and
$I$ range over $\mathbf{I}$. $[s]_a$ is an actor with name $a$ in state $s$, and $a \triangleleft M$ is a message with addressee $a$ and contents $M$.
A configuration interior, $I$, is a multiset of actors and messages in which no two actor entities have the same name.
$RR$ is a set of reaction rules that specify the behavior of individual actors and their synchronization with other
internal actors and messages. Elements of $RR$ are triples of the form $l : I \Rightarrow I'$, where $l$ is the rule label, $I$ is the rule
source, and $I'$ is the rule target.

The primitive operations of $AT$ are $acq$ and $\hat{\cdot}$. The acquaintance function, $acq : \mathbf{S} \cup \mathbf{M} \cup \mathbf{L} \to \mathbf{P}_\omega[\mathbf{A}]$, gives the
(finite) set of actor names occurring in a state, message contents, or label. $acq$ extends homomorphically to structures
built from the primitive sorts. Actor addresses cannot be explicitly created by actors, and the semantics cannot depend
on the particular choice of addresses of a group of actors. A renaming mechanism is used to formulate this requirement.
We let $Bij(\mathbf{A})$ be the set of bijections on $\mathbf{A}$ (renamings) and let $\alpha$ range over $Bij(\mathbf{A})$. For any such $\alpha$, $\hat{\alpha}$ is the
associated renaming function on states, message contents, and labels. Renaming is extended naturally to structures
built from addresses, states, and values. For example $\hat{\alpha}([s]_a) = [\hat{\alpha}(s)]_{\alpha(a)}$. Renaming, $\hat{\alpha}$, commutes with the
acquaintance function and is determined by the restriction of $\alpha$ to the acquaintances of an object. It is a bijection on
$\mathbf{A} \cup \mathbf{M} \cup \mathbf{L}$. To state the axioms for reaction rules, we define two auxiliary functions: $InAct, ExtAct : \mathbf{I} \to \mathbf{P}_\omega[\mathbf{A}]$.
$InAct(I)$ is the set of names of actors that occur in $I$, and $ExtAct(I)$ is the set of names of external actors referred to
in $I$: $InAct(I) = \{a \in \mathbf{A} \mid (\exists s \in \mathbf{S})([s]_a \in I)\}$ and $ExtAct(I) = acq(I) - InAct(I)$.

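Axioms for Reaction rules (RR): If $l : I \Rightarrow I' \in RR$, then

(i) $InAct(I) \neq \emptyset$

(ii) $l : I_0 \Rightarrow I_0' \in RR$ implies $InAct(I) = InAct(I_0)$ and $InAct(I') = InAct(I_0')$

(iii) $InAct(I) \subseteq InAct(I') \subseteq acq(l)$

(iv) $ExtAct(I') \subseteq ExtAct(I)$

(v) $\hat{\alpha}(l) : \hat{\alpha}(I) \Rightarrow \hat{\alpha}(I') \in RR$ for any renaming $\alpha$ in $Bij(\mathbf{A})$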

(i) states that reactions must involve at least one existing actor; (ii) states that a label uniquely determines the actors
involved in a reaction; (iii) states that actors cannot disappear and that the actors involved in a reaction must be made
explicit as acquaintances of the reaction label; (iv) states that no references to external actors are acquired in an internal
transition, although some may be forgotten; and (v) states that the set of rules is closed under renaming.
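
The axioms can be checked mechanically on a single rule instance. The following is a minimal sketch, assuming a hypothetical encoding that is not part of the paper: an entity is a triple of kind, name, and the set of actor names it mentions, and an interior is a list of such triples. Only axioms (i), (iii) and (iv) are checked, since (ii) and (v) quantify over the whole rule set.

```python
# Hypothetical encoding (an assumption for illustration only):
#   ("actor", name, names_in_state)  or  ("msg", addressee, names_in_contents)

def acq(interior):
    """All actor names occurring in an interior (acq extended to structures)."""
    names = set()
    for _kind, name, mentioned in interior:
        names |= {name} | set(mentioned)
    return names

def in_act(interior):
    """InAct: names of the actors that occur in the interior."""
    return {name for kind, name, _ in interior if kind == "actor"}

def ext_act(interior):
    """ExtAct: names referred to that are not internal actors."""
    return acq(interior) - in_act(interior)

def check_rule(label_acq, source, target):
    """Check axioms (i), (iii) and (iv) for one rule instance l : source => target."""
    return {
        "i": len(in_act(source)) > 0,
        "iii": in_act(source) <= in_act(target) <= set(label_acq),
        "iv": ext_act(target) <= ext_act(source),
    }

a = ("actor", "a", frozenset({"b"}))          # actor a knows external b
send_ok = check_rule({"a"}, [a], [a, ("msg", "b", frozenset())])
send_bad = check_rule({"a"}, [a], [a, ("msg", "c", frozenset())])  # c appears from nowhere
```

In the second instance the target mentions a name, c, that the source never referred to, so axiom (iv) fails while (i) and (iii) still hold.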

If $l : I \Rightarrow I' \in RR$, we call $InAct(I)$ the old actors of $l$ and $InAct(I') - InAct(I)$ the new actors of $l$.

An actor configuration is a configuration interior, $I$, together with two sets of actor names: the receptionists, $\rho$,
which are a subset of the internal actors of the interior; and the externals, $\chi$, which include all actors mentioned in the
interior that are not internal actors.

Definition (Configurations, $\mathbf{K}$): $\mathbf{K} = \{ \langle I \rangle^{\rho}_{\chi} \mid \rho \subseteq InAct(I) \wedge ExtAct(I) \subseteq \chi \}$. We let $K$ range over $\mathbf{K}$.

The computations of a configuration are given by the labelled transition relation $K \xrightarrow{l} K'$. $K$ is the source of
the transition, $K'$ is the target, and $l$ is the label. Transition labels are either rule labels, input/output labels, or a
special idle label, idle. An input label has the form $in(a \triangleleft M)$, indicating a message coming in from the environment.
An output label has the form $out(a \triangleleft M)$, indicating a message transmitted to the environment. We now let the range
of $l$ include these additional transition labels.

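Definition (Transition rules):

\[
\begin{array}{lll}
\text{(internal)} & \langle I_0, I \rangle^{\rho}_{\chi} \xrightarrow{\;l\;} \langle I_1, I \rangle^{\rho}_{\chi} & \text{if } l : I_0 \Rightarrow I_1 \in RR \\
\text{(in)} & \langle I \rangle^{\rho}_{\chi} \xrightarrow{\;in(a \triangleleft M)\;} \langle I, a \triangleleft M \rangle^{\rho}_{\chi \cup (acq(M) - \rho)} & \text{if } a \in \rho \wedge acq(M) \cap InAct(I) \subseteq \rho \\
\text{(out)} & \langle I, a \triangleleft M \rangle^{\rho}_{\chi} \xrightarrow{\;out(a \triangleleft M)\;} \langle I \rangle^{\rho \cup (acq(M) - \chi)}_{\chi} & \text{if } a \notin InAct(I) \\
\text{(idle)} & K \xrightarrow{\;idle\;} K &
\end{array}
\]

In (internal) we assume that the configurations are well-formed: $InAct(I_1) \cap InAct(I) = \emptyset$, $\rho \subseteq InAct(I_0) \cup
InAct(I)$, and $ExtAct(I_0, I) \subseteq \chi$.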
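
The (in) and (out) rules describe how the interface of a configuration grows as messages cross its boundary. The sketch below illustrates this under an assumed encoding; the `Config` class, the triple representation of entities, and the function names `step_in` and `step_out` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

# Hypothetical entity encoding: ("actor" | "msg", name, names mentioned).
Entity = Tuple[str, str, FrozenSet[str]]

@dataclass(frozen=True)
class Config:
    interior: Tuple[Entity, ...]
    receptionists: FrozenSet[str]
    externals: FrozenSet[str]

def in_act(interior):
    return frozenset(name for kind, name, _ in interior if kind == "actor")

def step_in(cfg: Config, addressee: str, contents_acq) -> Optional[Config]:
    """(in): accept a message from the environment, or None if not enabled."""
    contents_acq = frozenset(contents_acq)
    enabled = (addressee in cfg.receptionists
               and (contents_acq & in_act(cfg.interior)) <= cfg.receptionists)
    if not enabled:
        return None
    msg = ("msg", addressee, contents_acq)
    # Names in the contents that are not receptionists become externals.
    return Config(cfg.interior + (msg,), cfg.receptionists,
                  cfg.externals | (contents_acq - cfg.receptionists))

def step_out(cfg: Config, msg: Entity) -> Optional[Config]:
    """(out): release a message whose addressee is not an internal actor."""
    if msg[0] != "msg" or msg not in cfg.interior or msg[1] in in_act(cfg.interior):
        return None
    rest = list(cfg.interior)
    rest.remove(msg)
    # Names in the contents that are not externals become receptionists.
    return Config(tuple(rest), cfg.receptionists | (msg[2] - cfg.externals),
                  cfg.externals)

actor_a = ("actor", "a", frozenset({"b"}))
k0 = Config((actor_a,), frozenset({"a"}), frozenset({"b"}))
k1 = step_in(k0, "a", {"c"})                      # c becomes an external
reply = ("msg", "b", frozenset({"a"}))
k2 = step_out(Config((actor_a, reply), frozenset(), frozenset({"b"})), reply)
```

In the last step the outgoing message carries the name a, so a becomes a receptionist even though it was not one before.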

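The computation paths of a configuration, $\mathcal{P}(K)$, are the computation paths whose initial configuration is $K$.
Definition (Computation Paths, $\mathcal{P}$, $\mathcal{P}(K)$): $\mathcal{P}$ is the set of sequences, $\pi$, of the form

$[\, K_i \xrightarrow{l_i} K_{i+1} \mid i \in \mathbf{N} \,]$, and $\mathcal{P}(K) = \{ \pi \in \mathcal{P} \mid K \text{ is the source of } \pi(0) \}$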

A  finite  computation  is  a  path  in  which  all  but  a  finite  number  of  the  transition  labels  are  idle.  Recall  that  actor 
computations  are  required  to  be  fair.  Thus  we  do  not  want  to  consider  arbitrary  paths,  only  the  fair  ones.  A  computation 
is  fair  if  whenever  a  transition  is  enabled,  either  it  eventually  fires  or  it  becomes  permanently  disabled.  We  only 
consider enabledness for transitions whose label is a reaction rule label or an output label. We cannot force the
environment to do an input, and the idle transitions are simply ignored for the purpose of fairness. $\mathcal{F}(K)$ is the set of fair
paths for $K$.

In  analogy  to  thinking  of  a  sequential  procedure  as  a  black  box  characterized  by  its  input/output  relation,  we  would 
like  to  think  of  an  actor  system  as  a  black  box  characterized  by  the  set  of  possible  interactions  with  its  environment. 
Thus  we  define  the  interaction  semantics  of  an  actor  system  in  such  a  way  as  to  hide  the  details  of  internal  transitions. 
The interaction semantics of a configuration is its set of possible interaction paths. An interaction path of a configuration
is an infinite sequence of interaction labels together with an initial interface consisting of a pair of finite sets of
actor  names  (the  receptionists  and  externals).  An  interaction  label  is  either  an  input/output  label  or  the  special  sign, 
$\tau$, standing for possible internal activity. The infinite sequence of interaction labels in an interaction path is obtained
from a computation path by mapping internal transitions to silent transitions.

The function $isem$ maps transition labels to interaction labels and computation paths to interaction paths. The
receptionists and externals of $isem(\pi)$ are those of the initial configuration of $\pi$. The interaction sequence of $isem(\pi)$
is the sequence of labels obtained by replacing internal and idle transition labels with $\tau$.

Definition ($isem(\pi)$, $Isem(K)$):

\[
isem(l) = \begin{cases} \tau & \text{if } l \in \mathbf{L} \cup \{idle\} \\ l & \text{if } l \in in(\mathbf{Msg}) \cup out(\mathbf{Msg}) \end{cases}
\]

$isem(\pi)$ is the interaction path whose receptionists and externals are those of the source of $\pi(0)$ and whose interaction
sequence is $\iota$, where $\pi(i) = K_i \xrightarrow{l_i} K_{i+1}$ and $\iota(i) = isem(l_i)$ for $i \in \mathbf{N}$

$Isem(K) = \{ isem(\pi) \mid \pi \in \mathcal{F}(K) \}$
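
As an illustration of the label mapping, here is a small sketch that works on a finite prefix of a computation path. Real paths are infinite, so this only approximates the definition. The tuple encoding of labels and the names `TAU`, `isem_label` and `isem_prefix` are assumptions made for the example.

```python
TAU = "tau"  # stands for possible internal activity

def isem_label(label):
    """Map a transition label to an interaction label.

    Labels are hypothetical tuples: ("in", a, M), ("out", a, M),
    ("rule", name) for reaction rule labels, and ("idle",).
    """
    return label if label[0] in ("in", "out") else TAU

def isem_prefix(receptionists, externals, labels):
    """Interaction view of a finite prefix: initial interface plus mapped labels."""
    return (frozenset(receptionists), frozenset(externals),
            [isem_label(l) for l in labels])

prefix = [("in", "a", "ping"), ("rule", "seq(a)"), ("idle",), ("out", "b", "pong")]
view = isem_prefix({"a"}, {"b"}, prefix)
```

Input and output labels pass through unchanged, while rule labels and idle steps become indistinguishable.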
So far we have been working in the context of a fixed but arbitrary actor theory. In the case that we consider
interaction semantics in more than one actor theory, we index $Isem$ by the name of the actor theory, writing $Isem_{AT}(K)$.
It is sometimes convenient to restrict the interactions of a configuration with its environment by restricting the possible
set of input messages. For $V \subseteq \mathbf{M}$, we define $Isem(K) \lceil V$ to be the set of interaction paths in $Isem(K)$ whose input
labels are messages with contents in $V$.

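Definition ($Isem(K) \lceil V$):

$Isem(K) \lceil V = \{ isem(\pi) \mid \pi \in \mathcal{F}(K) \wedge (\forall i \in \mathbf{N}, a \in \mathbf{A}, M \in \mathbf{M})(\pi(i) = in(a \triangleleft M) \Rightarrow M \in V) \}$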
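
The restriction can be pictured as a filter on paths. The sketch below assumes finite label sequences and the same hypothetical tuple encoding of labels as before; the function names are invented for this example.

```python
def input_contents(labels):
    """Contents of every input label ("in", addressee, contents) in a sequence."""
    return [l[2] for l in labels if l[0] == "in"]

def restrict(paths, allowed):
    """Keep only the paths whose inputs all carry contents from `allowed`."""
    return [p for p in paths if all(m in allowed for m in input_contents(p))]

paths = [
    [("in", "a", "ping"), "tau", ("out", "b", "pong")],
    [("in", "a", "debug"), "tau"],
]
kept = restrict(paths, {"ping"})
```

Output labels are not constrained, which matches the definition: only the inputs offered by the environment are limited to $V$.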

For  a  given  actor  language,  we  usually  define  the  reaction  rules  for  an  actor  theory  by  giving  the  semantics  in  terms 
of  basic  reduction  steps  for  expressions  of  the  language.  We  call  this  a  small  step  actor  theory.  It  is  simple  to  define,  but 
gives  rise  to  computations  with  many  small  and  mostly  uninteresting  steps.  In  the  following  we  show  how  to  transform 
such an actor theory into a big step actor theory which preserves the interaction semantics of the language. In the big
step theory, internal computation steps are those that create actors, send messages, or involve some synchronization of
actors  and  messages,  thus  suppressing  further  details  of  internal  computation  of  an  actor. 

The key ideas motivating the transformation are the notions of silent step and that of a path being in big-step
form. A silent step is one involving a single actor that creates no new messages or actors. A path in big-step
form consists of input/output transitions and non-silent steps, each preceded by the silent steps necessary to
prepare the reacting actors. For $AT = \langle \langle \mathbf{A}, \mathbf{S}, \mathbf{M}, \mathbf{L} \rangle, \langle acq, \hat{\cdot} \rangle, RR \rangle$ we define its big-step variant $AT^{b}$ by $AT^{b} =
\langle \langle \mathbf{A}, \mathbf{S}, \mathbf{M}, \mathbf{L} \rangle, \langle acq, \hat{\cdot} \rangle, RR^{b} \rangle$ where

$RR^{b} = \{ l : I \Rightarrow I' \mid (\exists I'')(I \Rightarrow^{*} I'' \wedge l : I'' \Rightarrow I' \in RR \text{ is a non-silent rule}) \}$

and $I \Rightarrow^{*} I''$ is a sequence of silent steps. The crucial property of the big step operation is that it preserves interaction
semantics. Let $AT$ be an actor theory and let $K$ be a configuration of $AT$. Then

Theorem (small2big): $Isem_{AT}(K) = Isem_{AT^{b}}(K)$

The  proof  relies  on  the  ability  to  put  paths  into  big-step  form. 


3  The  Kernel  Language 

We assume given an infinite set of variables, $\mathbf{X}$. We also assume as given a collection of basic or atomic data, $\mathbf{At}$, that
includes the booleans $t, f \in \mathbf{Bool}$; Scheme style symbols, $\mathbf{Sym}$ ($\mathbf{Sym}$ includes nil, the empty or null list); (constants
denoting the elements of) the integers, $\mathbf{Z}$; and actor names, $\mathbf{A}$. Expressions are built from atoms and variables by
the following operations: $\lambda$-abstraction, application of primitive operations to sequences of expressions, conditional
branching, and an actor creation construct. The primitive operations include operations on basic data and pairs, and
kernel primitives manipulating actors, procedures, and local continuations. The data operations dOp contain the
recognizers boolean? for booleans, symbol? for symbols, integer? for integers, cons? for pairs, and actor? for
actors (all of arity 1); pairing cons, car, cdr (arities 2, 1, 1); the equality predicate, equal?, on atomic data; and the
usual arithmetic operations, aOp. We consider actor addresses to be atomic data and consequently can tell one address
from another. The functional specific primitives are procedure?, the recognizer for procedures (arity 1); app, lambda
application (arity 2); and clc, control abstraction (arity 1). We include app in the list of primitive operations as a
technical convenience, to make the syntax more concise. The actor primitives consist of an actor creation construct plus
the operations self (arity 0), the name of the executing actor; send, asynchronous send (arity 2); and ready, establishing
behavior for receiving (arity 1). Actor creation expressions are of the form $letactor\{x_0 := e_0, \ldots, x_k := e_k\}\, e$,
where the $x_i$ are pairwise distinct variables. Executing a letactor expression creates a new actor entity $a_i$ for each
$x_i$, executing expression $e_i$ with the $x_j$ bound to the $a_j$. The original executing actor then proceeds by executing $e$ (with the $x_i$
bound to the $a_i$).

The top level syntactic construct is a kernel program, which describes a configuration. For convenience, kernel
programs may include a library of mutually recursive definitions. For this purpose we reserve a subset $\mathbf{Funid}$ of $\mathbf{X}$
to be used as function names.

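Definition (Kernel Programs):

\[
\begin{array}{rcl}
{}^{k}\mathbf{M} &=& \mathbf{At} \cup cons({}^{k}\mathbf{M}, {}^{k}\mathbf{M}) \\
{}^{k}\mathbf{Program} &=& program(\,receptionists : \mathbf{P}_\omega[\mathbf{A}],\ externals : \mathbf{P}_\omega[\mathbf{A}], \\
&& \quad library : \mathbf{P}_\omega[\mathbf{Funid} := \lambda\mathbf{X}.{}^{k}\mathbf{E}], \\
&& \quad actors : \mathbf{P}_\omega[\mathbf{A} := {}^{k}\mathbf{E}], \\
&& \quad messages : \mathbf{M}_\omega[\mathbf{A} \triangleleft {}^{k}\mathbf{M}]\,)
\end{array}
\]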

where  the  function  identifiers  in  the  library  part  and  actor  names  in  the  actors  part  must  be  distinct,  and  all  actor 
names  occurring  in  an  actor  state  or  message  contents  must  either  be  one  of  the  actor  names  defined  in  the  actors 
part  or  one  of  the  names  occurring  in  the  externals  part.  Message  contents  are  simply  values  built  up  from  the 
atomic  data  via  the  pairing  operation  cons.  Lambda  abstractions  and  structures  containing  lambda  abstractions  are 
not  allowed  to  be  communicated  in  messages. 

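Definition (Kernel Expressions):

\[
\begin{array}{rcl}
{}^{k}\mathbf{O} &=& dOp \cup \{procedure?, app, clc\} \cup \{self, ready, send\} \\
\mathbf{At} &=& \mathbf{A} \cup \mathbf{Bool} \cup \mathbf{Z} \cup \mathbf{Sym} \\
{}^{k}\mathbf{E} &=& \mathbf{X} \cup \mathbf{At} \cup \lambda\mathbf{X}.{}^{k}\mathbf{E} \cup {}^{k}\mathbf{O}_n({}^{k}\mathbf{E}^n) \cup if({}^{k}\mathbf{E}, {}^{k}\mathbf{E}, {}^{k}\mathbf{E}) \cup letactor\{(\mathbf{X} := {}^{k}\mathbf{E})^{+}\}\,{}^{k}\mathbf{E}
\end{array}
\]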
We let $x, y, z$ range over $\mathbf{X}$, $a$ range over $\mathbf{A}$, ${}^{k}e$ range over ${}^{k}\mathbf{E}$, and ${}^{k}M$ range over ${}^{k}\mathbf{M}$. The binding constructs are
letactor and $\lambda$. $\lambda x.e$ binds the variable $x$ in the expression $e$. $letactor\{\ldots, x_i := {}^{k}e_i, \ldots\}\,{}^{k}e$ binds the $x_i$ in
each of the ${}^{k}e_j$, and also in ${}^{k}e$. Two expressions are considered equal if they are the same up to the renaming of bound
variables. For any expression $e$, we write $FV(e)$ for the set of free variables of $e$. We write $e'[x := e]$ to denote the
expression obtained from $e'$ by simultaneously replacing all free occurrences of $x$ by $e$, avoiding the capture of free
variables in $e$. We use standard abbreviations: let, for lambda application; not and and, for boolean
functions; and $letrec\{fid_j = \lambda x.{}^{k}e_j\}_{1 \le j \le k}$ for mutually recursive definitions. We also use the following definitions
for structuring message contents.

$list_n = \lambda x_1.\lambda x_2.\ldots\lambda x_n.\,cons(x_1, cons(x_2, \ldots cons(x_n, nil)))$
$msgMk = \lambda x_{mid}.\lambda x_{args}.\lambda x_{cust}.\,list_3(x_{mid}, x_{args}, x_{cust})$
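
To make the shape of message contents concrete, here is a sketch of the two abbreviations in Python. It models pairs as 2-tuples and nil as the string "nil", and it takes all arguments at once rather than one at a time as the curried $\lambda$-terms do. Both choices are assumptions made for the example.

```python
NIL = "nil"  # the empty list, modeled here as a plain string

def cons(head, tail):
    """Pairing, modeled as a Python 2-tuple."""
    return (head, tail)

def list_n(*items):
    """Build cons(x1, cons(x2, ... cons(xn, nil))) from the given items."""
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result

def msg_mk(mid, args, cust):
    """Message contents: a three-element list of method name, arguments, customer."""
    return list_n(mid, args, cust)

contents = msg_mk("get", list_n(1, 2), "c")
```

The arguments are themselves a nested list, so a message is a list whose second element is a list.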

As indicated earlier, the semantics is given by defining an actor theory, ${}^{k}AT$. The only primitive sort of ${}^{k}AT$ that
remains to be defined is the set of kernel actor theory states, ${}^{k}\mathbf{S}$.

Definition (${}^{k}\mathbf{S}$): ${}^{k}\mathbf{S} = \{ {}^{k}e \in {}^{k}\mathbf{E} \mid FV({}^{k}e) = \emptyset \}$

The acquaintances of a state (or of message contents, for that matter) are simply the actor names occurring therein, and
renaming is simply substitution. The meaning of a kernel program is defined to be a configuration of ${}^{k}AT$ as follows.

Definition ($[\![{}^{k}P]\!]$): Let ${}^{k}P$ be given by

program(receptionists : $\rho$, externals : $\chi$, library :
actors  ;  {  messages  ;  {oj 

then  l^Pj  =  Y  where  U' }]<j<n 

To complete the semantics all that remains is to define the reaction rules. To do this we decompose each non-value
expression as a reduction context filled with a redex. Reduction contexts identify the subexpression of an expression
that is to be evaluated next using the standard call-by-value reduction strategy of [Plo75]. An expression
$e$ is either a value or it can be decomposed uniquely into a reduction context filled with a redex. Thus, local actor
computation is deterministic.

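Definition (${}^{k}\mathbf{V}$, ${}^{k}\mathbf{E}_{rdx}$, ${}^{k}\mathbf{R}$): The set of values, ${}^{k}\mathbf{V}$, the set of redexes, ${}^{k}\mathbf{E}_{rdx}$, and the set of reduction contexts,
${}^{k}\mathbf{R}$, are defined by

\[
\begin{array}{rcl}
{}^{k}\mathbf{V} &=& \mathbf{At} \cup cons({}^{k}\mathbf{V}, {}^{k}\mathbf{V}) \cup \lambda\mathbf{X}.{}^{k}\mathbf{E} \\
{}^{k}\mathbf{E}_{rdx} &=& ({}^{k}\mathbf{O}_n({}^{k}\mathbf{V}^n) - cons({}^{k}\mathbf{V}, {}^{k}\mathbf{V})) \cup if({}^{k}\mathbf{V}, {}^{k}\mathbf{E}, {}^{k}\mathbf{E}) \cup letactor\{(\mathbf{X} := {}^{k}\mathbf{E})^{+}\}\,{}^{k}\mathbf{E} \\
{}^{k}\mathbf{R} &=& \{\bullet\} \cup {}^{k}\mathbf{O}_{n+m+1}({}^{k}\mathbf{V}^n, {}^{k}\mathbf{R}, {}^{k}\mathbf{E}^m) \cup if({}^{k}\mathbf{R}, {}^{k}\mathbf{E}, {}^{k}\mathbf{E})
\end{array}
\]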

We let ${}^{k}R$ range over ${}^{k}\mathbf{R}$. With the exception of the actor primitives letactor, send, and ready, reduction steps
are silent: they only depend on information local to the executing actor and only affect the state of the executing
actor. Thus we define a sequential reduction relation, $e \mapsto_{X} e'$, on expressions that lifts uniformly to define the silent
reaction rules. The decoration $X$ is an abstract context introduced to make the dependence on local context explicit.
We use a function $Self(X)$ that extracts the name of the executing actor from $X$. To define the sequential relation, we
first define the purely functional reduction relation $r \to e$, which gives the rules for redexes that do not manipulate
the reduction context. The rules are standard and are omitted. The sequential reduction relation is then defined by
lifting functional reduction and adding the rule for clc.

Definition (Sequential steps ($\mapsto_{X}$)):

(rdx) ${}^{k}R[\![r]\!] \mapsto_{X} {}^{k}R[\![e]\!]$ if $r \to e$

(clc) ${}^{k}R[\![clc({}^{k}v)]\!] \mapsto_{X} app({}^{k}v, \lambda x.{}^{k}R[\![x]\!])$, where $x \notin FV({}^{k}R[\![nil]\!])$

clc captures the actor's local continuation, $R$, as a function, $\lambda x.R[\![x]\!]$, and applies its argument $v$ to this function in
the empty reduction context (the local top level). We let $\mapsto_{X}^{*}$ be the reflexive, transitive closure of $\mapsto_{X}$. Now
we are ready to define the reaction rules of ${}^{k}AT$.

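Definition (${}^{k}RR$):

\[
\begin{array}{ll}
seq(a) : & [{}^{k}e]_a \Rightarrow [{}^{k}e']_a \quad \text{if } {}^{k}e \mapsto_{X} {}^{k}e' \text{ where } Self(X) = a \\
send(a) : & [{}^{k}R[\![send({}^{k}v_0, {}^{k}v_1)]\!]]_a \Rightarrow [{}^{k}R[\![nil]\!]]_a ,\ {}^{k}Emit({}^{k}v_0 \triangleleft {}^{k}v_1) \\
ready(a, {}^{k}M) : & [{}^{k}R[\![ready({}^{k}v)]\!]]_a ,\ a \triangleleft {}^{k}M \Rightarrow [app({}^{k}v, {}^{k}M)]_a \\
leta(a, \bar{a}) : & [{}^{k}R[\![letactor\{\bar{x} := {}^{k}\bar{e}\}\,{}^{k}e]\!]]_a \Rightarrow [{}^{k}R[\![{}^{k}e[\bar{x} := \bar{a}]]\!]]_a ,\ \{ [{}^{k}e_i[\bar{x} := \bar{a}]]_{a_i} \}_{1 \le i \le m} \\
& \text{if } Len(\bar{a}) = Len(\bar{x}) = m \text{ and } \bar{a} \cap acq({}^{k}R[\![{}^{k}\bar{e}, {}^{k}e]\!]) = \emptyset
\end{array}
\]

where ${}^{k}Emit({}^{k}v_0 \triangleleft {}^{k}v_1) = {}^{k}v_0 \triangleleft {}^{k}v_1$ if ${}^{k}v_0 \in \mathbf{A}$ and ${}^{k}v_1 \in {}^{k}\mathbf{M}$; otherwise it is $\emptyset$. The meta-function ${}^{k}Emit$ prevents
ill-formed messages from getting into the system. The labels of ${}^{k}AT$ are

$seq(\mathbf{A}) \cup send(\mathbf{A}) \cup ready(\mathbf{A}, {}^{k}\mathbf{M}) \cup leta(\mathbf{A}, \mathbf{A}^{*})$

where in the case of $leta(a, \bar{a})$ we require $a \notin \bar{a}$. $acq(l)$ is just the union of old and new except for the delivery label,
where $acq(ready(a, {}^{k}M)) = \{a\} \cup acq({}^{k}M)$. Again, renaming is just substitution.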
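
The role of the Emit meta-function can be sketched as a guard on outgoing messages. The encoding below is hypothetical: `ACTORS` is an assumed fixed set of actor names, pairs are written `("cons", head, tail)`, and $\lambda$-abstractions are written `("lam", x, body)`. The empty multiset is modeled as an empty list.

```python
ACTORS = {"a", "b"}  # hypothetical set of actor names, for this example only

def is_contents(value):
    """True for atoms and for pairs of contents; lambda abstractions are excluded."""
    if isinstance(value, tuple):
        return (len(value) == 3 and value[0] == "cons"
                and is_contents(value[1]) and is_contents(value[2]))
    return isinstance(value, (bool, int, str))

def emit(target, contents):
    """Return a one-message multiset if the message is well formed, else nothing."""
    well_formed = (isinstance(target, str) and target in ACTORS
                   and is_contents(contents))
    return [(target, contents)] if well_formed else []

good = emit("a", ("cons", 1, "nil"))
not_an_actor = emit(3, 1)
carries_lambda = emit("a", ("lam", "x", "x"))
```

A send with a bad target or with contents containing a procedure still completes, but no message enters the interior.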

4  The  User  Language 

The user language has the same variables, basic data, actor names, and data operations as the kernel language. In
addition we assume given two disjoint, countably infinite sets of identifiers: $\mathbf{Funid}$ for functions, and $\mathbf{Behid}$ for
behaviors. Expressions are built from atoms and variables by the following operations: application of primitive
operations to sequences of expressions, let binding, conditional branching, the letactor actor creation construct,
and asynchronous and synchronous method invocation. The primitive operations include dOp and the following user
primitives: self, as in the kernel; customer, the customer of the current message (arity 0); $fid_i$, user defined
operations (arity $i$) for $i \in \mathbf{N}$, $fid \in \mathbf{Funid}$; and $ready_{bid,i}$, specifying the behavior for the next message (arity $i$) for
$i \in \mathbf{N}$, $bid \in \mathbf{Behid}$. An asynchronous invocation is of the form ${}^{u}e_a \triangleleft mid({}^{u}\bar{e})@{}^{u}e_c$. The target of the request is
the value of ${}^{u}e_a$, and the message contents has method name $mid$, arguments ${}^{u}\bar{e}$, and customer ${}^{u}e_c$. Once the target,
arguments, and customer are evaluated, nil is returned as the value and the requesting actor proceeds with its
computation without waiting for a reply. A synchronous invocation (also referred to as a request or remote procedure call)
is of the form ${}^{u}e_a . mid({}^{u}\bar{e})$. The target of the request is the value of ${}^{u}e_a$, and the message contents has method name
$mid$ and arguments ${}^{u}\bar{e}$. The requesting actor suspends execution until a reply is received. A ready expression is of the
form $ready_{bid,n}({}^{u}e_1, \ldots, {}^{u}e_n)$ (also written $ready(bid({}^{u}e_1, \ldots, {}^{u}e_n))$). Execution of a ready expression terminates
processing of the current message and looks for the next message enabled for the behavior $bid$ with parameters given
by the values of the ${}^{u}e_i$. If there is no enabled message in the local message queue, the actor waits for one to be
delivered. In the user language there is no lambda abstraction and thus no functions as values. Instead, each program
contains a library of (mutually recursive) function and behavior definitions. A behavior definition has the form

behavior bid(p)(methodDefs)

where bid is the behavior identifier, p is a parameter list (a list of distinct variables), and methodDefs is a set of method
definitions. A method definition has the form

method mid(p) [disable-when ${}^u e^c$] ${}^u e^m$

where mid is a method name (a symbol from Sym), p is a parameter list, ${}^u e^c$ is the [optional] disabling condition
(assumed false when not present) that determines when a method can be invoked, and ${}^u e^m$ is the method body. ${}^u e^c$ is
required to be functional, i.e. its evaluation involves no actor primitives other than self or customer. For consistency
we require that a method (i.e. a method identifier) should have a unique definition within a given behavior. The free
variables of constraints and method bodies must be among the method parameters or the behavior parameters. A
function definition has the form

function fid(p) ${}^u e$

where fid is a function identifier, p is a parameter list, and ${}^u e$ is an expression, the function body. The free variables
of the function body must be among the function parameters.

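Definition (User Programs and Libraries):

$${}^u\mathcal{M} = MethId[{}^u\mathcal{V}^*]@(A \cup \{nil\})$$

$$\begin{aligned}
{}^u Program = program(&receptionists : P_\omega[A],\ externals : P_\omega[A], \\
&library : P_\omega[BehDef \cup FunDef], \\
&actors : P_\omega[A := {}^u\mathcal{E}],\ messages : M_\omega[A \triangleleft {}^u\mathcal{M}])
\end{aligned}$$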

where the actor names in the actors part must be distinct, and all actor names occurring in an actor state or message
contents must either be one of the actor names defined in the actors part or one of the names occurring in the
externals part. We let ${}^uM$ range over ${}^u\mathcal{M}$. We let c range over $A \cup \{nil\}$ and we may omit the customer part if it is
nil. Message contents consist of a method identifier (symbol), an argument list (a list of values), and a customer (an
actor name or nil signifying no customer). We identify MethId with Sym and let mid range over MethId and use
mid to stand for a symbol used as a method identifier. $mid[\bar{v}]@c$ abbreviates the list construction $msgMk(mid, \bar{v}, c)$.
We use the following meta functions: $msgMeth(M)$ selects the method component; $msgArgs(M)$ selects the
arguments component; and $msgCust(M)$ selects the customer component. A library is well-formed if it contains at
most one definition of each $fid \in$ Funid and $bid \in$ Behid, and these definitions themselves are well-formed. We
let x, y, and p range over lists of distinct variables. User expressions, ${}^u\mathcal{E}$, are defined in a manner similar to kernel
expressions and we omit the details. We let ${}^u e$ range over ${}^u\mathcal{E}$.

As for the kernel language, the semantics is given by defining an actor theory ${}^uAT$. Since libraries do not evolve
we parameterize the actor theory by the library of definitions in force, letting library be just part of the auxiliary axioms
describing the actor theory. Thus to give the semantics we need only define user states, ${}^u\mathcal{S}$ (since the message contents,
${}^u\mathcal{M}$, have been explained above) and give ${}^uRR$, relative to the given library.

There are five kinds of actor states:

• $[{}^u e, c, Q]$ - processing message with customer c, with current state ${}^u e$;

• $[bid(\bar{v}), Q_l, Q_r]$ - traversing the queue of delivered but unprocessed messages, $Q = Q_l * Q_r$, looking for a
message that is not disabled. The current behavior is $bid(\bar{v})$. The messages (contents) in $Q_r$ have been checked
and rejected (i.e. they are disabled). The messages in $Q_l$ are yet to be checked;

• $[\varphi, {}^uM, bid(\bar{v}), Q_l, Q_r]$ - checking disabling constraints of behavior bid for message ${}^uM$;

• $[a', {}^uR, c, Q]$ - waiting for a reply to a request. ${}^uR$ is a reduction context, the continuation of the computation
upon receipt of an answer; a' is an actor created to serve as a reply address, to distinguish the request-reply from
other arriving messages;

• $[a]$ - the state of an actor serving as a reply address for a request sent by a;

where a, a' are actor names, ${}^u e$ is an expression (of the user language), c is a customer (an actor name or nil), $\varphi$ is a
functional expression, ${}^uM$ is the contents of a user message, and Q is a mail queue, a sequence of messages (or more
precisely their contents).

A state of the form $[{}^u e, c, Q]$ is an execution state. It attempts to step by decomposing ${}^u e$ into a reduction context
and redex and reducing the redex. It is hung if the redex fails to reduce. A state of the form $[bid(\bar{v}), Q_l, Q_r]$ is an
execution state if $Q_l$ is not empty. If $\bar{v}$ does not match the parameter list of the definition of bid then the state is
hung. Otherwise it steps by starting evaluation of the constraints associated, in the behavior definition for bid, with
the method name of the first message in $Q_l$. If $Q_l$ is empty then the state is waiting for a message to be delivered (it has
already walked through its queue and found no enabled messages). A state of the form $[\varphi, {}^uM, bid(\bar{v}), Q_l, Q_r]$ steps
by evaluating $\varphi$ one step if it is not a value expression. If $\varphi$ is the value f, then it starts evaluation of the method body
associated with the method of ${}^uM$ in the behavior definition for bid. If $\varphi$ is a value other than f then ${}^uM$ is considered
disabled and put on the end of the rejects queue, $Q_r$. States of the last two forms occur in pairs $[a', {}^uR, c, Q]_a$, $[a]_{a'}$
that are waiting for a reply to a request by a that will arrive as a message to a', serving as a unique request identifier.
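
The queue walk described above can be sketched in Python. This is only an illustration under simplifying assumptions, not the paper's semantics: disabling constraints are modelled as Python predicates evaluated in a single step, `None` stands in for nil, and the names `walk_queue` and `buffer_behavior` are hypothetical.

```python
def walk_queue(behavior, params, queue):
    """Find the first enabled message in `queue`.

    behavior: dict mapping a method name to (disabled, body), where
              disabled(params, args) -> bool plays the role of the
              disabling constraint (an assumption of this sketch).
    queue:    list of messages (mid, args, customer).
    Returns (message, rest) if an enabled message exists, where rest is
    Q_l * Q_r, or (None, Q_r) if every message was rejected and the
    actor has to wait for a delivery.
    """
    q_l, q_r = list(queue), []
    while q_l:
        msg = q_l.pop(0)                  # take the head of Q_l
        mid, args, _customer = msg
        method = behavior.get(mid)
        # A message with no matching method counts as disabled,
        # as does one whose disabling constraint holds.
        if method is None or method[0](params, args):
            q_r.append(msg)               # append to the rejects queue
        else:
            return msg, q_l + q_r         # process msg, keep Q_l * Q_r
    return None, q_r


# Hypothetical one-slot buffer: `put` is disabled when full, `get` when empty.
buffer_behavior = {
    "put": (lambda params, args: params["full"], "put-body"),
    "get": (lambda params, args: not params["full"], "get-body"),
}
```

With an empty buffer and the queue `[get, put]`, the sketch rejects `get`, selects `put`, and leaves `get` queued for a later walk.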

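The meaning of a user program is defined to be a configuration of ${}^uAT$ as follows.

Definition ($[\![{}^uP]\!]$): Let ${}^uP$ be given by

$$\begin{aligned}
{}^uP = program(&receptionists : \rho,\ externals : \chi,\ library : Lib, \\
&actors : \{a_j := {}^u e_j\}_{1 \le j \le k},\ messages : \{a'_j \triangleleft {}^uM_j\}_{1 \le j \le n})
\end{aligned}$$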


then  |“P]  =  (“f  where  “7  =  {  [“ej,nil,nil]  ^“^j}i<j<n 

To complete the definition of ${}^uAT$ we must give the reaction rules. We first define some auxiliary meta functions
and predicates to ease definition of rules concerning behaviors and methods: $behMatch(Lib, bid, \bar{v})$ tests whether the
parameters of ready expressions match those of the behavior definition; $cstrExp(Lib, bid, \bar{v}, mid(\bar{v}')@c)$ extracts the
constraint associated with a method; and $methExp(Lib, bid, \bar{v}, mid(\bar{v}')@c)$ extracts a method body from a library
given a behavior identifier, a parameter list, and a message. We write ${}^u e[p := \bar{v}]$ for the simultaneous substitution of
the ith value in $\bar{v}$ for the ith variable in p. This is defined only when p and $\bar{v}$ have the same length. The definition
of cstrExp reflects the fact that a message is considered disabled if there is no matching method definition and that
messages with matching method definitions are by default enabled if there is no explicit disabling constraint.

As in the kernel language, to give the reaction rules, we first define the sequential reduction relation ${}^u e \to_X {}^u e'$,
parameterized by an abstract context X. We also use a function $customer(X)$ to extract the customer of the current
message. We define the values ${}^u\mathcal{V}$, reduction contexts ${}^u\mathcal{R}$ and redexes ${}^u\mathcal{E}_{rdx}$ of the user language, analogous to the
kernel language definitions, again giving the unique decomposition property for non-value expressions. We let ${}^uR$
range over ${}^u\mathcal{R}$. The sequential reduction relations are defined similarly to the kernel case and again we omit details.
Let $\to^*$ be the reflexive, transitive closure of $\to$. Notice that the sequential rules are sufficient to evaluate
functional expressions; in particular we only need the sequential rules to check constraints.

The  labelled  reaction  rules  for  the  user  language  are  given  by  the  following. 

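Definition (${}^uRR$):

seq(a): $[{}^u e, c, Q]_a \Rightarrow [{}^u e', c, Q]_a$ if ${}^u e \to_{a,c} {}^u e'$

send(a): $[{}^uR[v \triangleleft mid(\bar{v})@v'], c, Q]_a \Rightarrow [{}^uR[nil], c, Q]_a,\ {}^uEmit(v \triangleleft mid(\bar{v})@v')$

rpc(a, $a_0$): $[{}^uR[v . mid(\bar{v})], c, Q]_a \Rightarrow [a_0, {}^uR, c, Q]_a,\ [a]_{a_0},\ {}^uEmit(v \triangleleft mid(\bar{v})@a_0)$
if $a_0 \notin acq([{}^uR[v . mid(\bar{v})], c, Q]_a)$

rcv(a, $a_0$, v): $[a_0, {}^uR, c, Q]_a,\ [a]_{a_0},\ a_0 \triangleleft mid([v] * \bar{v})@c' \Rightarrow [{}^uR[v], c, Q]_a$

deliver(a, ${}^uM$): $[bid(\bar{v}), [\,], Q]_a,\ a \triangleleft {}^uM \Rightarrow [bid(\bar{v}), [{}^uM], Q]_a$

walk(a): $[{}^uR[ready(bid(\bar{v}))], c, Q]_a \Rightarrow [bid(\bar{v}), Q, [\,]]_a$ if $behMatch(Lib, bid, \bar{v})$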

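leta(a, $\bar{a}$): $[{}^uR[letactor\{\bar{x} := {}^u\bar{e}\}\,{}^u e], c, Q]_a \Rightarrow [{}^uR[{}^u e[\bar{x} := \bar{a}]], c, Q]_a,\ \{[{}^u e_i[\bar{x} := \bar{a}], c, [\,]]_{a_i}\}_{1 \le i \le k}$
if $Len(\bar{a}) = Len(\bar{x}) = k$ and $\bar{a} \cap acq([{}^uR[letactor\{\bar{x} := {}^u\bar{e}\}\,{}^u e], c, Q]_a) = \emptyset$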
cstr(a): $[bid(\bar{v}), [{}^uM] * Q_l, Q_r]_a \Rightarrow [cstrExp(Lib, bid, \bar{v}, {}^uM), {}^uM, bid(\bar{v}), Q_l, Q_r]_a$

enable(a): $[f, {}^uM, bid(\bar{v}), Q_l, Q_r]_a \Rightarrow [methExp(Lib, bid, \bar{v}, {}^uM), msgCust({}^uM), Q_l * Q_r]_a$

disable(a): $[v, {}^uM, bid(\bar{v}), Q_l, Q_r]_a \Rightarrow [bid(\bar{v}), Q_l, Q_r * [{}^uM]]_a$ if $v \ne f$

check(a): $[{}^u e_0, {}^uM, bid(\bar{v}), Q_l, Q_r]_a \Rightarrow [{}^u e_1, {}^uM, bid(\bar{v}), Q_l, Q_r]_a$ if ${}^u e_0 \to_{a, msgCust({}^uM)} {}^u e_1$

where ${}^uEmit(v \triangleleft {}^uM) = v \triangleleft {}^uM$ if $v \in A$ and $msgCust({}^uM) \in A \cup \{nil\}$, otherwise it is $\emptyset$. As in the kernel
language, the meta-function ${}^uEmit$ prevents ill-formed messages from getting into the system.
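
The filtering role of the Emit meta-function can be illustrated with a small Python sketch. The representation is an assumption made for illustration: messages are `(mid, args, customer)` tuples, `None` stands in for nil, `actor_names` is the set A, and the function name `emit` is hypothetical.

```python
def emit(target, message, actor_names):
    """Return the list of messages actually put into the system.

    The message is emitted only if the target is an actor name and the
    customer is either an actor name or nil (None here); otherwise
    nothing is emitted.
    """
    _mid, _args, customer = message
    well_formed = target in actor_names and (
        customer is None or customer in actor_names
    )
    return [(target, message)] if well_formed else []
```

A send whose target evaluated to a number, for instance, produces no message at all rather than a malformed one.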


5  A  Semantics  Preserving  User  to  Kernel  Translation 

In this section we define a translation, u2k, from the user language to the kernel language, and show that this translation preserves interaction semantics.

u2k  is  a  family  of  maps,  one  for  each  syntactic  category.  The  members  of  the  family  are  distinguished  by  context 
of  application  rather  than  by  name.  Programs  are  translated  by  translating  the  library,  actors,  and  messages  parts.  A 
library  is  translated  by  translating  the  function  and  behavior  definitions,  producing  a  kernel  language  library.  An  actor 
description  is  translated  by  translating  the  expression  part  assuming  it  executes  in  a  local  context  in  which  the  current 
message  has  no  customer  and  message  queue  is  empty.  A  message  description  is  translated  by  simply  eliminating  the 
syntactic  sugar.  The  core  of  the  translation  is  its  behavior  on  expressions.  Expressions  are  translated  in  the  context 
of  a  user  library.  In  order  to  leave  this  dependence  implicit,  we  adopt  a  standard  convention  about  converting  user 
function  and  behavior  identifiers  into  variables  and  assume  sufficient  renaming  has  been  done  to  avoid  conflicts.  The 
translation $u2k({}^u e)$ of a user expression is a lambda term of the form $\lambda c.\lambda q.{}^k e$ which when applied to a customer, c,
and a message queue, q, (represented as a list) reduces to a kernel expression that corresponds to the user expression
executing in a local context where the current message has customer c and message queue elements are the elements
of q. We use the following abbreviation $u2k^*({}^u e, c, q) = app(u2k({}^u e), c, q)$ in defining the translation.

The translation of the expression forms that are common to the two languages as well as customer and
asynchronous send are straightforward. It amounts to passing the customer and message queue parameters to the translated
subexpressions. The translation of synchronous invocations (requests) and ready expressions are where care is needed. In
the  user  language,  the  transition  that  delivers  the  reply  to  a  request  involves  two  actors,  the  actor  requesting  the  reply 
and  the  actor  created  to  serve  as  the  reply  address,  as  well  as  the  reply  message.  Kernel  actor  transitions  involve  at 
most  one  actor.  The  three-body  interaction  is  replaced  by  a  delivery  to  the  reply  address  followed  by  a  forwarding 
and  delivery  to  the  requesting  actor.  To  avoid  forgery,  we  introduce  a  third  actor  which  has  null  behavior  and  simply 
serves  as  a  secret  key  known  only  by  the  requestor  and  the  actor  serving  as  the  reply  address.  The  forwarded  message 
is  tagged  with  this  key. 

The  translation  of  a  ready expression  must  produce  code  to  walk  the  message  queue,  checking  the  disabling 
constraints  for  the  method  of  each  message.  If  an  enabled  message  is  found,  then  the  translated  method  body  is 
executed.  If  the  end  of  the  queue  is  reached,  then  the  actor  executes  ready  with  a  behavior  that  treats  the  next  message 
delivered  as  the  next  element  of  the  message  queue  to  check. 

We  begin  by  defining  the  mapping  on  programs,  and  work  our  way  down  to  expressions.  Programs  are  translated 
as  follows: 

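$$\begin{aligned}
u2k(program(&receptionists : \rho,\ externals : \chi,\ library : {}^uLib, \\
&actors : \{a_i := {}^u e_i\}_{1 \le i \le k},\ messages : \{a_j \triangleleft {}^uM_j\}_{1 \le j \le n})) \\
= program(&receptionists : \rho,\ externals : \chi,\ library : u2k({}^uLib), \\
&actors : \{a_i := u2k^*({}^u e_i, nil, nil)\}_{1 \le i \le k},\ messages : \{a_j \triangleleft u2k({}^uM_j)\}_{1 \le j \le n})
\end{aligned}$$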

To translate function definitions we associate a kernel function symbol ${}^k fid$ to each user function identifier, fid.
The translation of a definition of fid yields a definition of ${}^k fid$. The translation of behavior definitions is a little more
complex. For each defined behavior identifier, bid, the translation consists of definitions of three operations: ${}^k bid$, the
top level behavior function; Qwalk[bid], which controls the queue walking for bid; and Mcheck[bid], which checks constraints for
a particular message.

User  functions,  behaviors  and  methods  have  parameter  lists.  The  translated  operations  will  be  applied  to  a  list  of 
arguments  and  must  check  if  the  number  of  arguments  is  correct  and  then  bind  these  to  the  individual  parameters.  For 
this purpose, we define a family of abbreviations parBind[p, v, e] that binds elements of p to corresponding elements
of v in e. The translation of a function definition is given by:

$$\begin{aligned}
u2k(function\ fid(p)\,{}^u e) = {}^k fid := \lambda c.\lambda q.\lambda y.\,if(&not(equal?(length(y), n)), \\
&hang, \\
&parBind(p, y, u2k^*({}^u e, c, q)))
\end{aligned}$$

where  Len(p)  =  n  and  hang  is  some  functional  redex  that  fails  to  reduce,  for  example  car(nil),  thus  hanging  the 
computation  if  the  arguments  do  not  match  the  parameters. 
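
The arity check and parameter binding in the translated function can be mimicked in Python. This is a loose sketch under stated assumptions: the kernel term is modelled as a Python closure, `hang` is modelled by raising a hypothetical `Hang` exception (the paper uses a stuck redex instead), and `translate_function` and the environment dictionary are illustrative names, not part of the paper.

```python
class Hang(Exception):
    """Stands in for a functional redex that fails to reduce, e.g. car(nil)."""


def translate_function(params, body):
    """Build a callable playing the role of the translated function.

    params: list of parameter names (the list p).
    body:   callable (c, q, env) -> value, standing in for the
            translated body evaluated with customer c and queue q.
    """
    n = len(params)

    def k_fid(c, q, y):
        if len(y) != n:
            raise Hang("argument list does not match parameter list")
        env = dict(zip(params, y))    # the role of parBind[p, y, ...]
        return body(c, q, env)

    return k_fid
```

For example, a two-parameter function applied to a one-element argument list hangs instead of binding a partial environment.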

Let methodDefs be (method $mid_i(p_i)$ [disable-when ${}^u e^c_i$] ${}^u e^m_i$ | $1 \le i \le m$) (${}^u e^c_i$ is taken to be f if no
disabling constraint is present). The translation u2k(behavior bid(p)(methodDefs)) of a behavior definition with
Len(p) = n is

$${}^k bid = \lambda q.\lambda y.\,if(not(equal?(length(y), n)),\ nil,\ Qwalk[bid](q, nil, y)).$$

$Qwalk[bid](q_l, q_r, y)$ waits for a message to be delivered if $q_l = nil$, and otherwise calls $Mcheck[bid](cdr(q_l), q_r, y)(car(q_l))$ to check the
first element of $q_l$. $Mcheck[bid](q_l, q_r, y)({}^uM)$ looks for a method definition matching ${}^uM$. If none exists, or if the
matching method is disabled relative to the behavior parameters y and the message arguments, then it calls
$Qwalk[bid](q_l, append(q_r, list({}^uM)), y)$. Otherwise it reduces to the appropriately instantiated method body.

We  give  the  clauses  for  the  most  interesting  cases: 

$$u2k(customer()) = \lambda c.\lambda q.c$$

$$u2k(fid({}^u e_1, \ldots, {}^u e_n)) = \lambda c.\lambda q.\,app({}^k fid, c, q, list_n(u2k^*({}^u e_1, c, q), \ldots, u2k^*({}^u e_n, c, q)))$$

$$u2k(letactor\{a_i := {}^u e_i\}_{1 \le i \le n}\,{}^u e) = \lambda c.\lambda q.\,letactor\{a_i := u2k^*({}^u e_i, c, nil)\}_{1 \le i \le n}\,u2k^*({}^u e, c, q)$$

$$\begin{aligned}
u2k(ready(bid({}^u e_1, \ldots, {}^u e_n))) = \lambda c.\lambda q.\,&let\{x_i := u2k^*({}^u e_i, c, q)\}_{1 \le i \le n} \\
&let\{y := list_n(x_1, \ldots, x_n)\} \\
&clc(\lambda k.\,app({}^k bid, q, y))
\end{aligned}$$

$$\begin{aligned}
u2k({}^u e_0 . mid[{}^u e_1, \ldots, {}^u e_n]) = \lambda c.\lambda q.\,&let\{x_i := u2k^*({}^u e_i, c, q)\}_{0 \le i \le n} \\
&let\{margs := list_n(x_1, \ldots, x_n)\} \\
&letactor\{a_{key} := nil\} \\
&let\{b := RpcAux(self(), a_{key})\} \\
&letactor\{a_r := ready(b)\} \\
&seq(send(x_0, msgMk(mid, margs, a_r)), \\
&\quad clc(\lambda k.\,ready(RpcWait(k, a_{key}))))
\end{aligned}$$

where the following definitions are also added to the generated kernel library of any program translation

$$RpcAux = \lambda x_a.\lambda x_{key}.\lambda m.\,send(x_a, msgMk(nil, list_1(car(msgArgs(m))), x_{key}))$$

$$\begin{aligned}
RpcWait = \lambda k.\lambda x_{key}.\lambda m.\,if(&equal?(x_{key}, msgCust(m)), \\
&app(k, car(msgArgs(m))), \\
&seq(send(self(), m), ready(RpcWait(k, x_{key}))))
\end{aligned}$$
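
The forwarding protocol behind RpcAux and RpcWait can be simulated in a few lines of Python. This is a sketch under simplifying assumptions, not the kernel semantics: mailboxes are plain deques, the secret key is a fresh Python object rather than an actor with null behavior, continuations are Python callables, and the names `rpc_aux`, `rpc_wait` and `run_demo` are illustrative.

```python
from collections import deque


def rpc_aux(requester, key, send):
    """Behavior of the reply address: forward the answer, tagged with key."""
    def on_message(m):
        _mid, args, _customer = m
        send(requester, (None, [args[0]], key))
    return on_message


def rpc_wait(k, key, self_name, send):
    """Behavior of the requester while it waits for the tagged reply."""
    def on_message(m):
        _mid, args, customer = m
        if customer is key:
            return k(args[0])      # resume the continuation with the answer
        send(self_name, m)         # not the reply: re-queue it, keep waiting
        return None
    return on_message


def run_demo():
    mailboxes = {"a": deque(), "reply": deque()}

    def send(target, msg):
        mailboxes[target].append(msg)

    key = object()
    forward = rpc_aux("a", key, send)
    wait = rpc_wait(lambda v: ("resumed", v), key, "a", send)

    send("a", ("other", [1], None))       # unrelated message arrives first
    send("reply", ("ans", [42], None))    # callee answers the reply address
    forward(mailboxes["reply"].popleft())

    result = None
    while result is None:
        result = wait(mailboxes["a"].popleft())
    return result, list(mailboxes["a"])
```

In the demo the unrelated message is sent back to the requester's own mailbox and only the message carrying the key resumes the continuation, which mirrors the `seq(send(self(), m), ready(RpcWait(k, x_key)))` branch.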

To  establish  correctness  of  the  user-kernel  translation,  we  extend  it  to  actor  theory  configurations  and  show  that 
this  mapping  preserves  interaction  semantics.  The  following  lemma  says  that  the  user-to-kernel  translation  commutes 
with  the  meaning  function  on  programs.  This  formalizes  the  commuting  diagram  of  §1. 

Lemma (user-to-kernel.1): For any user program ${}^uP$, $[\![u2k({}^uP)]\!] = u2k([\![{}^uP]\!])$

Proof: By calculation using the definitions of $[\![\ ]\!]$ and u2k.

The  main  work  of  the  proof  is  in  the  following  lemma. 

Lemma (user-to-kernel.2): For any user configuration ${}^uK$ we have

$$Isem({}^uK) = Isem(u2k({}^uK)) \restriction {}^u\mathcal{M}$$

The main theorem is an easy consequence of the above lemmas.

Theorem (user-to-kernel):

$$Isem({}^uP) = Isem(u2k({}^uP)) \restriction {}^u\mathcal{M}$$


6  Conclusions 

The  main  technical  contribution  of  this  paper  is  to  present  a  method  for  establishing  equivalence  of  actor  systems, 
or  more  generally  for  distributed  object-based  systems.  The  main  result  of  this  paper  is  a  proof  of  correctness  of 
what is essentially a stage of compilation of a high-level actor language. In [PT94] high-level object programming
constructs are explained by expansion in the Pict language. In [Wal95] a semantics for a variant of POOL is given
via translation to a sorted Pi calculus. This is shown to be a simulation (up to bisimulation) of a direct transition
system operational semantics of POOL. Core Facile is a synthesis of the typed lambda calculus and pi-calculus style
concurrency primitives. In [Ama94] a translation from Core Facile to a variant based on asynchronous communication
is given. The translation of a process is shown to preserve barbed bisimilarity, and barbed congruence of the translation
of two expressions implies congruence of the expressions. The converse is left open. The translation goes by an
intermediate language obtained by adding a control operator to the asynchronous Facile much as we have done in
the kernel language. In [AP94] an extension of the Pi-calculus to model locality and failure is translated into a
simply sorted Pi-calculus and similar properties are proved for the translation. Our approach differs in giving both
languages  an  abstract,  composable  semantics  in  the  same  semantic  domain  and  showing  that  the  translation  preserves 
the  abstract  semantics.  The  notion  of  barbed  bisimulation  seems  to  share  with  abstract  actor  structures  and  interaction 
semantics  the  objective  of  hiding  details  of  internal  computation.  More  detailed  investigation  of  the  relation  between 
these  approaches  is  an  interesting  topic  for  future  work. 
Abstract. In this paper we address min-max equations for periodic and
non-periodic  problems.  In  the  non-periodic  case  a  simple  algorithm  is 
presented  to  determine  whether  a  graph  has  a  potential  satisfying  the 
min-max  equations.  This  method  can  also  be  used  to  solve  a  more  general 
quasi  periodic  min-max  problem  on  periodic  graphs.  Also  some  results 
regarding  the  uniqueness  of  solutions  in  the  latter  case  are  given. 


1  Introduction 

Min-max problems can be considered to be a generalization of a variety of graph
problems  involving  potentials.  There  is  a  close  relationship  with  network  flow 
problems  (non-periodic  case),  see  e.g.  [1],  and  the  well  known  maximum  cycle 
mean  problem  (periodic  case),  see  e.g.  [10],  [8].  In  particular,  previous  results  in 
the  non-periodic  case  can  be  related  to  a  feasible  potential  function  p  observing 
lower  linear  constraints 

$$p(v_j) \le p(v_i) + w(v_i, v_j) \quad \forall v_j \in V^-,\ (v_i, v_j) \in E \qquad (1)$$

and an optimal potential function p using min constraints

$$p(v_j) = \min_{(v_i, v_j) \in E} \{p(v_i) + w(v_i, v_j)\} \quad \forall v_j \in V^- \qquad (2)$$

associated with some network $\mathcal{G}^-(V^-, E, w)$. Our paper addresses a generalization
where these sets of inequalities are mixed for a given network $\mathcal{G}(V, E, w)$
with their dual forms, that is upper linear constraints

$$p(v_j) \ge p(v_i) + w(v_i, v_j) \quad \forall v_j \in V^+,\ (v_i, v_j) \in E \qquad (3)$$

and max constraints

$$p(v_j) = \max_{(v_i, v_j) \in E} \{p(v_i) + w(v_i, v_j)\} \quad \forall v_j \in V^+ \qquad (4)$$

with $V^+ \cup V^- = V$. If a distance function d is given additionally, the corresponding
quasi periodic problem deals with edge weights $w(v_i, v_j) - \lambda(v_j)\,d(v_i, v_j)$ where
a specific period $\lambda(v_j)$ is associated with each node $v_j$.
In the area of interface timing verification, see [11, 17, 19], problems related
to the existence of min and/or max constraints frequently occur. There, the
difference between the potentials of two nodes must be maximized under various
constraints. In particular, it is possible to transform one of the problems
addressed in [11], [17] and [19] to a problem with mixed constraints (1), (2) and
(3). Different pseudo-polynomial algorithms are derived for the solution of this
problem based on iterative tightening [11], removing negative cycles [17] and
maximum separations [19]. However so far, neither a polynomial algorithm nor
a proof of intractability is known.

In  comparison  to  these  results,  we  are  mainly  dealing  with  constraints  (2) 
and  (4).  Note  that  constraints  of  the  form  (1)  or  (3)  can  easily  be  converted  into 
constraints (2) and (4) by simply adding one additional node and two edges for
each  node  Vj  with  constraints  of  type  (1)  or  (3).  Regarding  the  non-periodic  case 
our  paper  presents  efficient  pseudo-polynomial  algorithms  for  finding  optimal 
potentials  satisfying  constraints  (2)  and  (4). 

The  consideration  of  constraints  (4)  in  connection  with  periodic  graphs  has 
raised  significant  interest  in  the  past,  as  it  is  the  root  for  many  problems  from 
different  application  areas,  see  [9,  14,  10,  8].  This  includes  e.g.  control  theory 
and  manufacturing  [5],  timing  properties  of  discrete  event  systems  [15],  parallel 
algorithms  [16],  and  other  areas  of  computer  science.  A  comprehensive  treatment 
of  the  theory  and  its  applications  can  be  found  in  [2].  Especially  the  use  of  linear 
equations  over  a  new  max-plus  algebra  [5,  2]  has  produced  many  results.  Some 
of  these  results  have  even  been  generalized  to  problems  which  are  periodic  in 
multiple  dimensions,  see  [3]. 

Driven by application areas like asynchronous circuit design, timing and
protocol verification, and timing behavior of general Petri nets, some recent
approaches addressed the generalization of these results to dynamic graphs with
constraints of the form (4) and (2). These dynamic min-max systems have been
investigated in [12, 13, 2]. Further results in this direction are described in [6, 7].
However,  the  models  used  in  these  two  groups  of  papers  are  quite  different.  Olsder 
[12,  13]  describes  a  periodic  min-max  problem  in  terms  of  an  eigenvalue  problem, 
whereas  Gunawardena  [6,  7]  defines  a  certain  class  of  min-max  functions.  Both 
models  are  special  cases  of  those  used  in  our  paper. 

Also with respect to numerical procedures and the uniqueness of the period,
the results in [12, 13] are restricted to a subclass of min-max problems. On
the other hand, [6, 7] contain "complete" results in the case that only two
distances have the value 1 while all others are zero. For all other considered cases
($d(v_i, v_j) \in \{0, 1\}$), there is no procedure which decides whether a min-max
system has a period or not. Moreover, the given algorithm for the computation of
the period is exponential in the size of the graph. In this area our paper contains
the following new results:

- A relation between potential functions of dynamic and weight transformed
static graphs is derived. This is similar to a known result for max-plus
problems [4].

-  Results  on  the  uniqueness  of  the  periods  in  the  quasi-periodic  case  are  given 
as  well  as  algorithms  to  determine  these  periods. 
2 The Static Min-Max Problem

2.1  Definitions  and  Properties 

We  start  this  section  by  defining  various  forms  of  graph  potentials. 

Definition 1 (Min-Max Potential). Assume a weighted digraph $\mathcal{G}(V = V^+ \cup V^-, E, w)$
with $V^+ \cap V^- = \emptyset$, $E \subseteq V \times V$ and $w : E \to \mathbb{Q}$, also called min-max
graph subsequently. Then, a potential $p : V \to \mathbb{Q}$ is called feasible if

$$p(v_i) \begin{cases} \ge p(v_j) + w(v_j, v_i) & \forall (v_j, v_i) \in E,\ v_i \in V^+ \\ \le p(v_j) + w(v_j, v_i) & \forall (v_j, v_i) \in E,\ v_i \in V^- . \end{cases}$$

Further, a feasible potential $p : V \to \mathbb{Q}$ is a min potential if

$$p(v_i) = \min_{(v_j, v_i) \in E} \{p(v_j) + w(v_j, v_i)\} \quad \forall v_i \in V^- .$$

Similarly, a feasible potential $p : V \to \mathbb{Q}$ is a max potential if

$$p(v_i) = \max_{(v_j, v_i) \in E} \{p(v_j) + w(v_j, v_i)\} \quad \forall v_i \in V^+ .$$

Finally, a potential $p : V \to \mathbb{Q}$ is a min-max potential if it is a min potential
and a max potential at the same time. ■
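
Definition 1 translates directly into a checker. The following Python sketch is an illustration only; the graph representation (a dictionary from edge pairs to weights), the function names, and the requirement that every node has at least one incoming edge are assumptions of the sketch.

```python
def in_edges(edges, v):
    """All (source, weight) pairs of edges entering v."""
    return [(u, w) for (u, dst), w in edges.items() if dst == v]


def is_feasible(v_plus, v_minus, edges, p):
    """Feasibility: >= on edges into V+ nodes, <= on edges into V- nodes."""
    for (vj, vi), w in edges.items():
        if vi in v_plus and p[vi] < p[vj] + w:
            return False
        if vi in v_minus and p[vi] > p[vj] + w:
            return False
    return True


def is_min_potential(v_plus, v_minus, edges, p):
    """Feasible, and every V- node attains the minimum over its in-edges."""
    return is_feasible(v_plus, v_minus, edges, p) and all(
        p[vi] == min(p[vj] + w for vj, w in in_edges(edges, vi))
        for vi in v_minus
    )


def is_max_potential(v_plus, v_minus, edges, p):
    """Feasible, and every V+ node attains the maximum over its in-edges."""
    return is_feasible(v_plus, v_minus, edges, p) and all(
        p[vi] == max(p[vj] + w for vj, w in in_edges(edges, vi))
        for vi in v_plus
    )


def is_min_max_potential(v_plus, v_minus, edges, p):
    return (is_min_potential(v_plus, v_minus, edges, p)
            and is_max_potential(v_plus, v_minus, edges, p))
```

On the two-node graph with edges u→v of weight 1 and v→u of weight -1 (u in V+, v in V-), the potential p(u) = 0, p(v) = 1 passes all four checks, while p(u) = p(v) = 0 is feasible but is neither a min nor a max potential.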

The  definition  of  a  min-max  potential  directly  leads  to  our  first  key  problem: 

Problem 2. Is there a min-max potential for a given min-max graph $\mathcal{G}$?

The  problem  can  be  simplified  by  using  the  following  few  observations: 

1. If $\mathcal{G}$ consists of two independent graphs, it is sufficient to consider each graph
separately.

2. If $\mathcal{G}^+ = (V, E \cap (V^+ \times V^+), w)$ contains a positive weight cycle, then there
is no min-max potential for $\mathcal{G}$ (positive cycle in a longest path problem).

3. If $\mathcal{G}^- = (V, E \cap (V^- \times V^-), w)$ contains a negative weight cycle, then there
is no min-max potential for $\mathcal{G}$ (negative cycle in a shortest path problem).

4. It suffices to only consider bipartite min-max graphs where $E \subseteq (V^+ \times V^-) \cup (V^- \times V^+)$,
as additional nodes can be inserted without changing the
problem substantially. A proof of this claim is given in [18].

Therefore, we assume for the remainder of this section that $\mathcal{G}$ is a connected
bipartite graph and that for each node $v_j \in V$ there is at least one edge $(v_i, v_j) \in E$.

In  the  next  corollary  we  show  that  knowledge  about  a  min  potential  for  a 
graph  can  provide  some  information  about  min  potentials  for  related  graphs. 
Corollary 3. If a bipartite graph $\mathcal{G}(V, E, w)$ has a min potential, then there is
also a min potential for any graph $\mathcal{G}'(V, E, w')$ with $w'(v_i, v_j) \le w(v_i, v_j)$ for
all $(v_i, v_j) \in E$. On the other hand, if a bipartite graph $\mathcal{G}(V, E, w)$ has no min
potential, then no min potential exists for any graph $\mathcal{G}'(V, E, w')$ with $w'(v_i, v_j) \ge w(v_i, v_j)$
for all $(v_i, v_j) \in E$.

Proof. Let p be a min potential of $\mathcal{G}$ and $w'(v_i, v_j) \le w(v_i, v_j)$ for all $(v_i, v_j) \in E$.
Then p' with

$$p'(v_i) = \begin{cases} p(v_i) & \text{for } v_i \in V^+ \\ \min_{(v_j, v_i) \in E} \{p(v_j) + w'(v_j, v_i)\} & \text{for } v_i \in V^- \end{cases}$$

is a min potential for $\mathcal{G}'$, as $p'(v_i) \ge p(v_j) + w(v_j, v_i) \ge p'(v_j) + w'(v_j, v_i)$ for
all $v_i \in V^+$ and $(v_j, v_i) \in E$. The second claim of the corollary is a direct
consequence of the first one. ■
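
The construction of p' in the proof can be written out as a short Python sketch. The representation (edge dictionary, sets for V+ and V-) and the name `lowered_min_potential` are assumptions made for illustration; the graph is taken to be bipartite with at least one incoming edge per node, as assumed in this section.

```python
def lowered_min_potential(v_plus, v_minus, new_edges, p):
    """Build p' from a min potential p after edge weights were lowered.

    new_edges: dict mapping (source, target) to the lowered weight w'.
    V+ nodes keep their potential; V- nodes are recomputed as the
    minimum over their incoming edges using the new weights.
    """
    p_new = {v: p[v] for v in v_plus}
    for vi in v_minus:
        p_new[vi] = min(
            p[vj] + w for (vj, dst), w in new_edges.items() if dst == vi
        )
    return p_new
```

For the graph with u in V+, v in V-, old weights w(u,v) = 1, w(v,u) = -1 and min potential p(u) = 0, p(v) = 1, lowering the weights to w'(u,v) = 0, w'(v,u) = -3 yields p'(u) = 0, p'(v) = 0, which still satisfies the inequality at u.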

Of course, a similar corollary holds for max potentials as well. It is easy to see
that 'tight' edges $(v_i, v_j)$ of a min-max graph with $p(v_j) = p(v_i) + w(v_i, v_j)$ are
especially important. For any min-max potential p, there must be a tight input
edge for each node $v_j \in V$. Also, a min-max potential for a graph $\mathcal{G}$ implies the
existence of a cycle C consisting of tight edges. In $\mathcal{G}$ this cycle C must be a zero
weight cycle, i.e. $\sum_{(v_i, v_j) \in C} w(v_i, v_j) = 0$.

Moreover, we can restrict ourselves to those min-max potentials where $\mathcal{G}_p$
is connected. Then, the difference between the min-max potential values of any
two vertices $|p(v_i) - p(v_j)|$ is bounded by the maximum length of any simple
(undirected) path in $\mathcal{G}$. For such a path we can use the following upper bound s:

$$s = \sum \Big( \max_{(v_i, v_j) \in E} \ldots \Big) \qquad (5)$$


2.2  Algorithms 

Now,  we  describe  a  method  to  determine  whether  a  bipartite  weighted  digraph 
has  a  min-max  potential.  This  method  is  based  on  Function  increase  in  Table  1. 

The  following  corollary  describes  the  possible  outcome  of  Function  increase, 
see  [18]  for  a  detailed  proof. 

Corollary 4. Function increase returns 'true' if and only if the bipartite min-max
graph $\mathcal{G}$ has a min potential; in that case the generated potential p is a min
potential. ■

Any change of the potential of a node $v_i \in V^+$ requires that at some time
during the execution of the function there was a node $v_j \in V^-$ with $p(v_i) = p(v_j) + w(v_j, v_i)$.
On the other hand, if $p(v_i) = p(v_j) + w(v_j, v_i)$ at any time
during the execution of the loop for a node $v_i \in V^+$, then there is a tight edge
$(v_k, v_i)$ for some node $v_k \in V^-$ provided the function returns 'true'. Hence, if
Boolean Function increase(G, p, G_t) {
    in G; inout p; out G_t;

    a = max{p(v) | v ∈ V};

    loop: p(v_j) = min{p(v_i) + w(v_i, v_j) | (v_i, v_j) ∈ E} for all v_j ∈ V^-;

    if (∃(v_j, v_i) ∈ E with v_i ∈ V^+ and p(v_i) < p(v_j) + w(v_j, v_i)) {
        p(v_i) = p(v_j) + w(v_j, v_i); }
    else { G_t = ∅; return 'true'; }

    if (there is no change in the potential of any node v_i with p(v_i) < a + s) {
        G_t = subgraph of G induced by all nodes with p(v_i) < a + s;
        return 'false'; }
    goto loop;
}


Table 1. Function increase


Function  increase  starts  with  a  max  potential  and  returns  ‘true’,  the  generated 
potential  p  will  be  a  min-max  potential.  This  leads  directly  to  the  following 
theorem: 

Theorems.  Q  has  a  min-max  potential,  if  and  only  if  it  has  a  min  potential 
and  a  max  potential.  * 

Therefore, a min-max potential of $\mathcal{G}$ can be detected by first applying
Function increase to an arbitrary initial potential and then applying its dual
counterpart Function decrease to the resulting potential. The existence of a
min-max potential for $\mathcal{G}$ requires that Function increase and Function decrease
both return ‘true’.

This procedure constitutes a pseudo-polynomial way to solve the min-max
problem. However, cycles with a small weight sum, e.g.
$w(v_i,v_j) + w(v_j,v_i) = \epsilon \to 0$, in connection with large edge weights may
lead to a large number of iterations. This problem is addressed in [18], where
we introduce improved alternatives to functions increase and decrease.
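The iteration behind Function increase can be illustrated with a small Python sketch. This is only one possible reading of Table 1, under stated assumptions: the graph is bipartite, the dictionary-based representation and the names `bound_s`, `v_plus`, `v_minus` are hypothetical, every violated edge into a max node is raised in the same pass, and the bound `s` is taken as the sum over all nodes of the largest absolute incoming weight.

```python
def bound_s(nodes, edges):
    """Assumed form of the bound s: sum over nodes of the max |w| on incoming edges."""
    return sum(
        max((abs(w) for (u, v), w in edges.items() if v == x), default=0)
        for x in nodes
    )


def increase(v_plus, v_minus, edges, p, s):
    """Sketch of the fixed-point iteration of Table 1.

    edges: dict (u, v) -> integer weight; p: dict node -> potential (updated in place).
    Returns (True, all nodes) or (False, nodes whose potential stayed below a + s).
    """
    a = max(p.values())
    incoming = {v: [] for v in list(v_plus) + list(v_minus)}
    for (u, v), w in edges.items():
        incoming[v].append((u, w))
    while True:
        changed_low = False
        # Min nodes take the minimum over their incoming edges.
        for v in v_minus:
            if incoming[v]:
                new = min(p[u] + w for u, w in incoming[v])
                if new != p[v]:
                    p[v] = new
                    changed_low |= new < a + s
        # Max nodes are raised along every violated incoming edge.
        raised = False
        for v in v_plus:
            for u, w in incoming[v]:
                if p[v] < p[u] + w:
                    p[v] = p[u] + w
                    raised = True
                    changed_low |= p[v] < a + s
        if not raised:
            return True, set(incoming)
        if not changed_low:
            return False, {v for v in incoming if p[v] < a + s}


edges_ok = {("a", "b"): 1, ("b", "a"): -1}   # zero-weight cycle
p_ok = {"a": 0, "b": 0}
result_ok = increase(["a"], ["b"], edges_ok, p_ok, bound_s(["a", "b"], edges_ok))

edges_bad = {("a", "b"): 1, ("b", "a"): 1}   # positive cycle, potentials grow without bound
p_bad = {"a": 0, "b": 0}
result_bad = increase(["a"], ["b"], edges_bad, p_bad, bound_s(["a", "b"], edges_bad))
```

In the first toy graph the loop stops after one pass with both equations satisfied; in the second the potentials keep growing until no node below `a + s` changes any more, which is the ‘false’ exit.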


3  The  Dynamic  Min-Max  Problem 

In  this  section  we  address  the  quasi-periodic  min-max  problem  on  dynamic 
graphs.  To  this  end,  we  first  define  dynamic  graphs  as  usual  via  static  graphs, 
see  e.g.  [9].  Then,  Problem  2  is  extended  to  dynamic  graphs. 

Definition 6 Static Graph. A (bipartite) static graph $\mathcal{G}_s(V,E,w,d)$ with
$V = V^+ \cup V^-$ is a bipartite weighted digraph with a weight function
$w : E \to \mathbb{Z}$ and a distance function $d : E \to \mathbb{Z}_{\ge 0}$. ■

Definition 7 Dynamic Graph. The dynamic graph corresponding to a given
static graph $\mathcal{G}_s(V,E,w,d)$ is an infinite weighted bipartite graph
$\mathcal{G}_d(V_d,E_d,w_d)$ where

$$V_d = \{ v_i(k) \mid v_i \in V,\ k \in \mathbb{Z}_{\ge 0} \},$$

$$E_d = \{ (v_i(k - d(v_i,v_j)), v_j(k)) \mid (v_i,v_j) \in E,\ k \in \mathbb{Z},\ k \ge d(v_i,v_j) \},$$

$$w_d(v_i(k - d(v_i,v_j)), v_j(k)) = w(v_i,v_j) \text{ for all } (v_i(k - d(v_i,v_j)), v_j(k)) \in E_d.$$
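To make the unrolling in Definition 7 concrete, the following Python sketch builds a finite prefix of the dynamic graph from a static graph. The function name `unroll`, the `horizon` cut-off and the dictionary layout are assumptions made for illustration; the dynamic graph itself is infinite.

```python
def unroll(static_edges, horizon):
    """Return the dynamic-graph edges whose target copy index k is at most `horizon`.

    static_edges: dict (u, v) -> (w, d) with integer weight w and distance d >= 0.
    Result: dict ((u, k - d), (v, k)) -> w for all d <= k <= horizon.
    """
    dynamic_edges = {}
    for (u, v), (w, d) in static_edges.items():
        for k in range(d, horizon + 1):
            dynamic_edges[((u, k - d), (v, k))] = w
    return dynamic_edges


static_edges = {("a", "b"): (3, 1), ("b", "a"): (-1, 0)}
prefix = unroll(static_edges, 2)
```

Here the edge with distance 1 connects copy `k - 1` of `a` to copy `k` of `b`, while the edge with distance 0 stays within one copy index.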


Definition 8 Quasi-Periodic Min-Max Potential. The quasi-periodic min-max
potential $p_d : V_d \to \mathbb{Q}$ of a dynamic graph $\mathcal{G}_d(V_d,E_d,w_d)$ is a min-max
potential $p_d$ for all $k \ge K = \max_{(v_i,v_j) \in E}\{d(v_i,v_j)\}$. Moreover, there is
a period function $\lambda : V \to \mathbb{Q}$ such that

$$p_d(v_i(k+1)) = p_d(v_i(k)) + \lambda(v_i) \text{ for all } v_i(k) \in V_d.$$


Problem 9. Is there a quasi-periodic min-max potential for a dynamic graph
$\mathcal{G}_d(V_d,E_d,w_d)$?

In  order  to  avoid  dealing  with  infinite  dynamic  graphs,  we  use  the  regularity 
of  those  graphs  to  describe  them  with  a  cycle  graph,  see  also  [1]. 

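Definition 10 Quasi-Periodic Cycle Graph. For a static graph
$\mathcal{G}_s(V,E,w,d)$ and a period function $\lambda : V \to \mathbb{Q}$ with $\lambda(v_i) \ge \lambda(v_j)$ for all
$v_i \in V^+$ and $v_i, v_j$ adjacent in $\mathcal{G}_s$, the quasi-periodic cycle graph
$\mathcal{G}_c(V,E_c,w_c)$ is defined by

$$E_c = \{ (v_i,v_j) \in E \mid \lambda(v_i) = \lambda(v_j) \},$$

$$w_c(v_i,v_j) = w(v_i,v_j) - \lambda(v_j)\, d(v_i,v_j) \text{ for all } (v_i,v_j) \in E_c.$$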
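The construction in Definition 10 is mechanical and can be sketched in a few lines of Python. The representation (a dictionary of edges carrying a weight and a distance, and a dictionary `lam` for the period function) is an assumption chosen for illustration.

```python
def cycle_graph(static_edges, lam):
    """Keep only edges whose end points share a period and reweight them.

    static_edges: dict (u, v) -> (w, d); lam: dict node -> period.
    Returns dict (u, v) -> w - lam[v] * d.
    """
    return {
        (u, v): w - lam[v] * d
        for (u, v), (w, d) in static_edges.items()
        if lam[u] == lam[v]
    }


static_edges = {("a", "b"): (3, 1), ("b", "a"): (-1, 1), ("b", "c"): (5, 2)}
lam = {"a": 1, "b": 1, "c": 0}
reweighted = cycle_graph(static_edges, lam)
```

With period 1 on `a` and `b`, the two remaining edges get weights 2 and -2, so the cycle through `a` and `b` has weight zero in the cycle graph; the edge into `c` is dropped because the periods differ.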


Now,  the  following  corollary  establishes  a  close  relation  between  the  quasi- 
periodic  min-max  problem  of  a  dynamic  graph  and  the  min-max  problem  of  the 
corresponding  quasi-periodic  cycle  graph. 

Corollary 11. Assume a static graph $\mathcal{G}_s(V,E,w,d)$. Then, the following two
statements are equivalent:

- The dynamic graph $\mathcal{G}_d$ corresponding to $\mathcal{G}_s$ has a quasi-periodic min-max
potential $p_d$ with the period function $\lambda$.

- The quasi-periodic cycle graph $\mathcal{G}_c$ corresponding to $\mathcal{G}_s$ and $\lambda$ has a min-max
potential.
Proof. A max potential of $\mathcal{G}_d$ requires for all $v_i(k) \in V_d^+$ and $k \ge K$ the
correctness of the equation

$$p_d(v_i(k)) = \max_{(v_j(k - d(v_j,v_i)),\, v_i(k)) \in E_d} \{ p_d(v_j(k - d(v_j,v_i))) + w_d(v_j(k - d(v_j,v_i)), v_i(k)) \}.$$

Using the definition of a dynamic graph and the periodicity of $p_d$ we obtain
the equivalent conditions

$$p_d(v_i(0)) = \max_{(v_j,v_i) \in E} \{ p_d(v_j(0)) + w_c(v_j,v_i) + (k - d(v_j,v_i))(\lambda(v_j) - \lambda(v_i)) \} \quad \forall k \ge K. \qquad (6)$$

1. $\mathcal{G}_d$ has a quasi-periodic min-max potential $\Rightarrow$ $\mathcal{G}_c$ has a min-max potential.
First assume that $\lambda(v_i) < \lambda(v_j)$. Then, $\mathcal{G}_c$ does not exist. Also, the validity
of Equation (6) for all $k \ge K$ prevents $p_d(v_i(0))$ from being finite.

On the other hand, if $\lambda(v_i) > \lambda(v_j)$, then edge $(v_j,v_i)$ cannot affect
Equation (6) for $k \to \infty$. Therefore, it need not be considered in Equation (6).
This results in

$$p_d(v_i(0)) = \max_{(v_j,v_i) \in E_c} \{ p_d(v_j(0)) + w_c(v_j,v_i) \} \quad \forall k \ge K,$$

which leads to a max potential $p_c(v_i) = p_d(v_i(0))$ for all $v_i \in V^+$.

2. $\mathcal{G}_c$ has a min-max potential $\Rightarrow$ $\mathcal{G}_d$ has a quasi-periodic min-max potential.
Suppose that a cycle graph is given with
$p_c(v_i) = \max_{(v_j,v_i) \in E_c} \{ p_c(v_j) + w_c(v_j,v_i) \}$ for all $v_i \in V^+$. If we set
$p_d(v_i(0)) = p_c(v_i)$ for all $v_i \in V$, then Equation (6) holds as
$\lambda(v_i) \ge \lambda(v_j)$ and the third term in the max-expression becomes sufficiently
negative for all edges in $E \setminus E_c$ and $k \to \infty$.

Similar arguments are used for all $v_i(k) \in V_d^-$. ■

Due to Corollary 11, the solution in the quasi-periodic case divides $\mathcal{G}_s$ into
subgraphs with different periods. In other words, the quasi-periodic cycle graph
$\mathcal{G}_c$ of $\mathcal{G}_s$ consists of unconnected subgraphs, where each subgraph has a
min-max potential and a period common to all nodes. This suggests an algorithm to
determine the periods and subgraphs by iteratively peeling off subgraphs with
decreasing periods from the static graph. Therefore, at first the case of a single
period $\lambda$ for all nodes $v_i \in \mathcal{G}_s$ will be considered. The functions lower-period
and upper-period are introduced to determine the single period $\lambda$ for all nodes
$v_i \in \mathcal{G}_s$.

Note that if a dynamic graph $\mathcal{G}_d$ has a periodic min-max potential with period
$\lambda$, then the corresponding cycle graph $\mathcal{G}_c$ has at least one directed cycle $C$ with
$\sum_{(v_i,v_j) \in C} w_c(v_i,v_j) = 0$. Assuming $\sum_{(v_i,v_j) \in C} d(v_i,v_j) \neq 0$ results in

$$\lambda = \frac{\sum_{(v_i,v_j) \in C} w(v_i,v_j)}{\sum_{(v_i,v_j) \in C} d(v_i,v_j)}. \qquad (7)$$
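The step from the zero-weight cycle to the period formula (7) can be spelled out as follows. The derivation assumes the cycle-graph weights of Definition 10 with one common period $\lambda$ on all nodes of the cycle $C$, and a non-zero sum of distances along $C$.

```latex
\begin{align*}
0 &= \sum_{(v_i,v_j)\in C} w_c(v_i,v_j)
   = \sum_{(v_i,v_j)\in C} \bigl( w(v_i,v_j) - \lambda\, d(v_i,v_j) \bigr) \\
  &= \sum_{(v_i,v_j)\in C} w(v_i,v_j) \;-\; \lambda \sum_{(v_i,v_j)\in C} d(v_i,v_j) \\
\Longrightarrow \quad
\lambda &= \frac{\sum_{(v_i,v_j)\in C} w(v_i,v_j)}{\sum_{(v_i,v_j)\in C} d(v_i,v_j)} .
\end{align*}
```

As a hypothetical numerical instance, a two-edge cycle with weights 3 and -1 and distances 1 and 1 gives $\lambda = (3 - 1)/(1 + 1) = 1$.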
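Boolean Function lower-period($\mathcal{G}_s$, $\lambda_l$, $p$) {
    in $\mathcal{G}_s$; out $\lambda_l$; inout $p$;
    determine $s$ and $t$;
    $\lambda_l = -s$; $\lambda_u = s$;
    generate the periodic cycle graph $\mathcal{G}_c$ of $\mathcal{G}_s$ and $\lambda_l$;
    if (increase($\mathcal{G}_c$, $p$, $\mathcal{G}_t$)) { $\lambda_l = -\infty$; return ‘true’; }
    generate the periodic cycle graph $\mathcal{G}_c$ of $\mathcal{G}_s$ and $\lambda_u$;
    if (!increase($\mathcal{G}_c$, $p$, $\mathcal{G}_t$)) { return ‘false’; }
    loop: $\lambda = (\lambda_u + \lambda_l)/2$;
          generate the periodic cycle graph $\mathcal{G}_c$ of $\mathcal{G}_s$ and $\lambda$;
          if (!increase($\mathcal{G}_c$, $p$, $\mathcal{G}_t$)) { $\lambda_l = \lambda$; } else { $\lambda_u = \lambda$; }
          if ($\lambda_u - \lambda_l < 1/t^2$) { return ‘true’; }
          goto loop;
}

Table 2. Function lower-period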


Now, we can introduce Function lower-period in Table 2 to determine the
minimal period $\lambda_l$ for which a periodic min-max potential may exist. This
function is based on binary search, see also [10], [1], and uses the following upper
bound $t$ for the sum of distances in any simple path in $\mathcal{G}_s$:

$$t = \sum_{v_j \in V} \Big( \max_{(v_i,v_j) \in E} \{ |d(v_i,v_j)| \} \Big). \qquad (8)$$


The  correctness  of  Function  lower-period  is  addressed  in  Corollary  12.  In  the 
remaining  part  of  this  section  all  corollaries  and  theorems  are  given  without 
proofs  due  to  space  restrictions.  Regarding  the  proofs  the  interested  reader  is 
referred  to  [18]. 


Corollary 12. If Function lower-period returns ‘false’, then the dynamic graph
$\mathcal{G}_d$ corresponding to $\mathcal{G}_s$ has no periodic min-max potential. Otherwise, $\mathcal{G}_d$ has
min potentials for all $k \ge K$ and for all periods $\lambda \ge \lambda_l$, while there is no
min potential for any period $\lambda < \lambda_l$. ■
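The binary search of Function lower-period can be sketched independently of the graph machinery by passing the feasibility test in as a function. In the sketch below, `has_min_potential` is a hypothetical stand-in for "build the periodic cycle graph for this period and call increase", and the stopping tolerance $1/t^2$ is an assumption about the garbled termination test in Table 2.

```python
import math
from fractions import Fraction


def lower_period(has_min_potential, s, t):
    """Bracket the smallest feasible period by bisection on [-s, s].

    has_min_potential: callable period -> bool, assumed monotone
    (infeasible below some threshold, feasible at and above it).
    Returns (ok, lam_l, lam_u).
    """
    lam_l, lam_u = Fraction(-s), Fraction(s)
    if has_min_potential(lam_l):
        # Feasible even at the lower bound: no finite lower period.
        return True, -math.inf, lam_l
    if not has_min_potential(lam_u):
        # Infeasible even at the upper bound.
        return False, None, None
    while lam_u - lam_l >= Fraction(1, t * t):
        lam = (lam_l + lam_u) / 2
        if has_min_potential(lam):
            lam_u = lam
        else:
            lam_l = lam
    return True, lam_l, lam_u


# Toy oracle with threshold 1/3 (purely illustrative).
ok, lo, hi = lower_period(lambda lam: lam >= Fraction(1, 3), s=4, t=3)
```

Exact rational arithmetic is used so that the bracket never suffers from rounding; the search keeps the invariant that the lower end is infeasible and the upper end feasible.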


Similarly, a Function upper-period based on Function decrease is used to
determine the maximal period $\lambda_u$ for which a periodic min-max potential may
exist. The combination of both functions yields an algorithm to determine whether
there are periodic min-max potentials for a dynamic graph $\mathcal{G}_d$. The proof is a
direct consequence of Corollary 12, its counterpart for Function upper-period and
Theorem 5.


Theorem 13. If either Function lower-period or Function upper-period returns
‘false’ or if $\lambda_l > \lambda_u$ is produced, then there is no periodic min-max potential for
the dynamic graph $\mathcal{G}_d$ corresponding to $\mathcal{G}_s$. Otherwise, there are periodic
min-max potentials for all periods $\lambda_l \le \lambda \le \lambda_u$. ■
In the following corollary we address the computational complexity of the
presented method.

Corollary 14. There is an algorithm which computes a periodic min-max
potential in pseudo-polynomial time. ■

Finally, a result on the uniqueness of a period $\lambda$ is derived.

Theorem 15. If the static graph $\mathcal{G}_s$ corresponding to a dynamic graph $\mathcal{G}_d$
contains only edges with distance $> 0$, then $\mathcal{G}_d$ either has no periodic min-max
potential or a min-max potential with a unique period. ■

Now, we can return to the main task of addressing the general case of different
periods associated with the nodes of the dynamic graph. In order to simplify the
following discussions, we suppose that there is no directed cycle with a zero sum
of distances in the given static graph $\mathcal{G}_s$,

$$\sum_{(v_i,v_j) \in C} d(v_i,v_j) > 0 \text{ for all directed cycles } C \text{ of } \mathcal{G}_s.$$

Remember that the period function defines a partition of the dynamic graph
into subgraphs whose nodes have equal periods. At first, these subgraphs are
defined formally. This is done using the weighted bipartite graph $\mathcal{G}$ in a similar
fashion as in the static min-max problem, see Section 2 and Definition 1.

Definition 16 Dominating Subgraph. A dominating subgraph $\mathcal{G}_t$ of a
digraph $\mathcal{G}$ (as defined in Definition 1) is a subgraph of $\mathcal{G}$ with the following
properties:

1. There exists a min potential $p$ of $\mathcal{G}$ which is a min-max potential of $\mathcal{G}_t$.

2.  There  are  no  edges  {vi^Vj)  or  {vj,Vi)  with  Vi  e  and  Vj  G 


Next, the following corollary provides results on one step of a procedure which
determines the quasi-periodic min-max potential of a given static graph. It is
shown that the concatenation of Functions lower-period and decrease

- peels off a subgraph of a given static graph,

- produces a period $\lambda_{max}$ and a corresponding min-max potential for this
subgraph, and

- leaves a remaining static graph that has a min potential for a period less than
$\lambda_{max}$.

Corollary 17. Given a static graph $\mathcal{G}_s$. After execution of Functions
lower-period($\mathcal{G}_s$, $\lambda_{max}$, $p$) with initial potentials $p(v_i) = 0$ and
decrease($\mathcal{G}_c$, $p$, $\mathcal{G}_t$) with the periodic cycle graph $\mathcal{G}_c$ corresponding to $\mathcal{G}_s$
and $\lambda_{max}$, the following properties hold:

1. $\mathcal{G}_t$ is a dominating subgraph of $\mathcal{G}_c$.

2. The application of Function lower-period(($\mathcal{G}_s \setminus \mathcal{G}_t$), $\lambda$, $p$) returns ‘true’ with
$\lambda < \lambda_{max}$.


Now, we are ready to present the complete algorithm for the calculation of the
quasi-periodic min-max potential, see Table 3. Input to the Function
period($\mathcal{G}_s$, $\lambda()$) is the given static graph $\mathcal{G}_s$, while its output is the resulting
period function $\lambda$. The corresponding min-max potentials can be either extracted
during execution of Function period or by using the proof of Corollary 11.


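Boolean Function period($\mathcal{G}_s$, $\lambda()$) {
    in $\mathcal{G}_s$; out $\lambda()$;
    loop: $p(v_i) = 0$ for all $v_i \in V_s$;
          lower-period($\mathcal{G}_s$, $\lambda_{max}$, $p$);
          generate the periodic cycle graph $\mathcal{G}_c$ of $\mathcal{G}_s$ and $\lambda_{max}$;
          if (decrease($\mathcal{G}_c$, $p$, $\mathcal{G}_t$)) {
              $\lambda(v_i) = \lambda_{max}$ for all $v_i \in V_t$;
              return ‘true’; }
          else {
              $\lambda(v_i) = \lambda_{max}$ for all $v_i \in V_t$;
              $\mathcal{G}_s = \mathcal{G}_s \setminus \mathcal{G}_t$;
              goto loop;
          }
}

Table 3. Function period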


Finally,  the  following  theorem  states  one  of  the  main  results  of  this  paper. 

Theorem 18. Any dynamic graph $\mathcal{G}_d$ has a quasi-periodic min-max potential.
The potential is unique. ■

4  Conclusion 

In this paper, we demonstrate a close relationship between static and dynamic
min-max problems. Also, pseudo-polynomial algorithms for the solution of
min-max equation systems in the quasi-periodic and non-periodic case are presented.
Further, we show that any dynamic graph has a unique period function.
Abstract. In this paper, we present deterministic parallel algorithms
for the coarse grained multicomputer (CGM) and bulk-synchronous parallel
computer (BSP) models which solve the following well known graph
problems: (1) list ranking, (2) Euler tour construction, (3) computing the
connected components and spanning forest, (4) lowest common ancestor
preprocessing, (5) tree contraction and expression tree evaluation, (6)
computing an ear decomposition or open ear decomposition, (7) 2-edge
connectivity and biconnectivity (testing and component computation),
and (8) chordal graph recognition (finding a perfect elimination ordering).
The algorithms for Problems 1-7 require $O(\log p)$ communication rounds
and linear sequential work per round. Our results for Problems 1 and 2
hold for arbitrary ratios $n/p$, i.e. they are fully scalable, and for Problems
3-8 it is assumed that $n/p \ge p^{\epsilon}$, $\epsilon > 0$, which is true for all commercially
available multiprocessors. We view the algorithms presented as an
important step towards the final goal of $O(1)$ communication rounds. Note
that the number of communication rounds obtained in this paper is
independent of $n$ and grows only very slowly with respect to $p$. Hence,
for most practical purposes, the number of communication rounds can
be considered as constant. The result for Problem 1 is a considerable
improvement over those previously reported. The algorithms for
Problems 2-7 are the first practically relevant deterministic parallel algorithms
for these problems to be used for commercially available coarse grained
parallel machines.
1 Introduction

The Models: Speedup results for theoretical PRAM algorithms do not
necessarily match the speedups observed on real machines [2] [31]. Given sufficient
slackness in the number of processors, Valiant’s BSP approach [34] simulates
PRAM algorithms optimally on distributed memory parallel systems. Valiant
points out, however, that one may want to design algorithms that utilize local
computations and minimize global operations [33] [34]. The BSP approach
requires that $g$ (= local computation speed / router bandwidth) is low, or fixed,
even for an increasing number of processors. Gerbessiotis and Valiant [17] describe
circumstances where PRAM simulations cannot be performed efficiently, among
others, if the factor $g$ is high. Unfortunately, this is true for most currently
available multiprocessors. The parallel algorithms presented in this paper consider
this case for graph problems.

As pointed out in [34], the cost of a message also contains a constant overhead
cost $s$. The value of $s$ can be fairly large and the total message overhead cost
can have a considerable impact on the speedup observed (see e.g. [8]). We are
therefore also using a more practical version of the BSP model, referred to as
the coarse grained multicomputer model (CGM) [8], [9], [10]. It is comprised
of a set of $p$ processors $P_1, \ldots, P_p$ with $O(n/p)$ local memory per processor
and an arbitrary communication network (or shared memory). All algorithms
consist of alternating local computation and global communication rounds. Each
communication round consists of routing a single $h$-relation with $h = O(n/p)$,
i.e. each processor sends $O(n/p)$ data and receives $O(n/p)$ data. We require
that all information sent from a given processor to another processor in one
communication round is packed into one long message, thereby minimizing the
message overhead. In the BSP model, a computation/communication round is
equivalent to a superstep with $L = \frac{n}{p} g$ (plus the above “packing requirement”).

Finding  an  optimal  algorithm  in  the  coarse  grained  multicomputer  model 
(CGM)  is  equivalent  to  minimizing  the  number  of  communication  rounds  as  well 
as  the  total  local  computation  time.  This  considers  all  parameters  discussed  above 
that  are  affecting  the  final  observed  speedup  and  it  requires  no  assumption  on 
g. Furthermore, it has been shown that minimizing the number of supersteps
also leads to improved portability across different parallel architectures ([33],
[34], [13]). The above model has been used (explicitly or implicitly) in parallel
algorithm design for various problems ([4], [8], [9], [14], [12], [22], [10]) and has
shown very good practical timing results.

The  Results:  In  this  paper,  we  study  deterministic  parallel  graph  algorithms 
for  the  CGM  and  BSP  models.  We  consider  the  following  well  known  graph 
problems: 

1.  list  ranking 

2.  Euler  tour  construction 

3.  computing  the  connected  components  and  spanning  forest 

4.  lowest  common  ancestor  preprocessing 
5. tree contraction and expression tree evaluation

6. computing an ear decomposition or open ear decomposition

7. 2-edge connectivity and biconnectivity (testing and component computation)

8. chordal graph recognition, finding a perfect elimination ordering

These  problems  have  been  extensively  studied  for  the  PRAM  (see  e.g.  [28]) 
and  for  fine-grained  parallel  network  models  of  computation  (see  e.g.  [1]).  However, 
for  the  practically  much  more  relevant  CGM/BSP  model  there  exist,  to  the  best 
of  our  knowledge,  only  a  few  results  on  parallel  graph  algorithms. 

Reid-Miller [27] presented an empirical study of parallel list ranking for the
Cray C-90. The paper followed essentially the CGM/BSP model and claimed
that this was the fastest list ranking implementation so far. The algorithm in [27]
required $O(\log n)$ communication rounds. In [11], an improved algorithm was
presented which required, with high probability, only $O(k \log p)$ rounds, where
$k \le \log^* n$. In [13], $O(\log p)$ communication rounds are achieved by a
randomized algorithm. Bäumker and Dittrich [3] presented a randomized connected
components algorithm for planar graphs using $O(\log p)$ communication rounds.
They suggest an extension of this algorithm for general graphs with the same
number of communication rounds.

We improve these results by giving the first deterministic algorithms for list
ranking and computing connected components using $O(\log p)$ rounds. This
improvement is an important step towards the ultimate goal, a deterministic
algorithm with only $O(1)$ communication rounds. In fact, it is an open problem
whether this is possible for these graph problems. Algorithms with $O(1)$ rounds
have been presented for various Computational Geometry problems [8, 9, 10, 11,
16], but the graph problems studied in this paper have considerably less
“internal structure” which could be exploited to obtain such solutions. Note that,
in practice, the number of processors is usually fixed. In contrast to the previous
deterministic results, the improved number of communication rounds obtained
in this paper, $O(\log p)$, is independent of $n$ and grows only very slowly with
respect to $p$. Hence, for most practical purposes, the number of communication
rounds can be considered as constant. We expect that this will be of considerable
practical relevance.

As in [27] we will, in general, assume that $n \gg p$ (coarse grained), because
this is usually the case in practice. Note, however, that our results for Problems
1 and 2 hold for arbitrary ratios $n/p$. Goodrich [18] calls such algorithms fully
scalable. For Problems 3-8 we will assume that $n/p \ge p^{\epsilon}$, $\epsilon > 0$, which is true for
all commercially available multiprocessors.

2  List  Ranking 

Let $L$ be a list represented by a vector $s$ such that $s[i]$ is the node following $i$
in the list $L$. The last element $l$ of the list $L$ is the one with $s[l] = l$. The distance
between $i$ and $j$, $d_L(i,j)$, is the number of nodes between $i$ and $j$ plus 1 (i.e. the
distance is 0 iff $i = j$, and it is one if and only if one node follows the other). The
list ranking problem consists of computing for each $i \in L$ the distance between $i$
and $l$, referred to as $rank_L(i) = d_L(i,l)$.
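As a reference point for the definition above, a purely sequential list ranking can be sketched as follows. The function name and the 0-based array encoding are assumptions for illustration; this is the kind of linear-time ranking a single processor can run on a list it holds locally, not the parallel algorithm itself.

```python
def list_rank(s):
    """Sequential list ranking for a successor array s with s[last] == last.

    Returns rank[i] = number of links from i to the last element.
    """
    n = len(s)
    last = next(i for i in range(n) if s[i] == i)
    # Build predecessor pointers so the list can be walked from the end.
    pred = [None] * n
    for i in range(n):
        if s[i] != i:
            pred[s[i]] = i
    rank = [0] * n
    i = last
    while pred[i] is not None:
        rank[pred[i]] = rank[i] + 1
        i = pred[i]
    return rank


# The list 1 -> 0 -> 2 -> 3, with 3 as last element.
ranks = list_rank([2, 0, 3, 3])   # [2, 3, 1, 0]
```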

For our algorithm, we need the following definitions. An $r$-ruling set is defined
as a subset of selected list elements that has the following properties: (1) No two
neighboring elements are selected. (2) The distance of any unselected element to
the next selected element is at most $r$.

An overview of our CGM list ranking algorithm is as follows. First, we
compute an $O(p^2)$-ruling set $R$ with $|R| = O(n/p)$ and broadcast $R$ to all processors.
More precisely, the $O(p^2)$-ruling set $R$ is represented as a linked list where each
element $i$ is assigned a pointer to the next element $j$ of $R$ with respect to the
order implied by $L$ as well as the distance between $i$ and $j$ in $L$. Then, every
processor sequentially performs a list ranking of $R$, computing for each $i \in R$
its distance to the last element of $L$. All other list elements have at most
distance $O(p^2)$ from the next element of $R$ in the list. Their distance is determined
by simulating standard PRAM pointer jumping until the next element of $R$ is
reached.
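The pointer jumping mentioned in the overview can be simulated sequentially to see why the number of rounds is logarithmic in the distance to be covered. The sketch below is the textbook PRAM technique run on the whole list, with each loop iteration standing for one synchronous round; the function name and the round counter are assumptions for illustration.

```python
def pointer_jumping_rank(s):
    """Simulate synchronous pointer jumping on a successor array s (s[last] == last).

    Returns (dist, rounds): dist[i] is the distance from i to the last element,
    rounds is the number of synchronous jumping rounds that were needed.
    """
    n = len(s)
    succ = list(s)
    dist = [0 if s[i] == i else 1 for i in range(n)]
    rounds = 0
    while any(succ[i] != succ[succ[i]] for i in range(n)):
        # All elements read the old arrays and write new ones (one PRAM step).
        new_dist = [dist[i] + dist[succ[i]] for i in range(n)]
        new_succ = [succ[succ[i]] for i in range(n)]
        dist, succ = new_dist, new_succ
        rounds += 1
    return dist, rounds


dist, rounds = pointer_jumping_rank([2, 0, 3, 3])   # ([2, 3, 1, 0], 2)
```

Each round doubles the distance spanned by a pointer, so covering a gap of $O(p^2)$ elements to the next member of the ruling set takes $O(\log p)$ rounds.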

All steps, except for the computation of the $O(p^2)$-ruling set $R$, can be easily
implemented in $O(\log p)$ communication rounds.

In the remainder of this section we introduce a new technique, called
deterministic list compression, which will allow us to compute an $O(p^2)$-ruling set in
$O(\log p)$ communication rounds.

The  basic  idea  behind  deterministic  list  compression  is  to  have  an  alternating 
sequence  of  compress  and  concatenate  phases.  In  a  compress  phase,  we  select  a 
subset  of  list  elements,  and  in  a  concatenate  phase  we  use  pointer  jumping  to 
work  our  way  towards  building  a  linked  list  of  selected  elements. 

For  the  compress  phase,  we  apply  the  deterministic  coin  tossing  technique 
of  [7]  but  with  a  different  set  of  labels.  Instead  of  the  memory  address  used 
in  [7],  we  use  the  number  of  the  processor  storing  list  item  i  as  its  label  l(i). 
During  the  computation,  we  select  sequentially  the  elements  of  R  in  the  sublists 
of  subsequent  nodes  in  L  which  are  stored  at  the  same  processor.  The  term 
“subsequent”  refers  to  successor  with  respect  to  the  current  value  of  s. 

Note that there are at most $p$ different labels, and subsequent nodes in those
parts of $L$ that are not processed sequentially have different labels. We call list
element $s[i]$ a local maximum if $l(i) < l(s[i]) > l(s[s[i]])$. We apply deterministic
coin tossing to those parts of $L$ that are not processed sequentially.
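The local-maximum rule can be tried out on a small example. The sketch below applies only that one selection rule to a successor array and a label array; the names are hypothetical, and the handling of runs of equal labels (elements stored at the same processor) is deliberately left out.

```python
def select_local_maxima(s, label):
    """Select every element s[i] whose label is a strict local maximum.

    s: successor array (s[last] == last); label[i]: processor number storing i.
    """
    selected = set()
    for i in range(len(s)):
        j = s[i]
        k = s[j]
        if label[i] < label[j] > label[k]:
            selected.add(j)
    return selected


# List 0 -> 1 -> 2 -> 3 -> 4 with processor labels 0, 2, 1, 3, 0.
chosen = select_local_maxima([1, 2, 3, 4, 4], [0, 2, 1, 3, 0])   # {1, 3}
```

Because a selected element has a strictly larger label than both of its list neighbours, two neighbouring elements can never both be selected by this rule.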

The naive approach of applying this procedure $O(\log p)$ times would yield
an $O(p^2)$-ruling set, but unfortunately it would require more than $O(\log p)$
communication rounds. Note that, when we want to apply it for a second, third,
etc. time, the elements selected previously need to be linked by pointers. Since
two subsequent elements selected by deterministic coin tossing can have distance
$O(p)$, this may require $O(\log p)$ communication rounds each. Hence, this
straightforward approach requires a total of $O(\log^2 p)$ communication rounds.

Notice, however, that if two selected elements are at distance $\Theta(p)$ at a given
moment, then it is unnecessary to further apply deterministic coin tossing in order
to reduce the number of selected elements. The basic approach of our algorithm is
therefore to interleave pointer jumping and deterministic coin tossing operations
with respect to our new labeling scheme. More precisely, we will have only one
pointer jumping step between subsequent deterministic coin tossing steps, and
such pointer jumping operations will not be applied to those list elements that
are pointing to selected elements.

This  concludes  the  high  level  overview  of  our  deterministic  list  compression 
techniques.  The  following  describes  the  algorithm  in  detail. 

Algorithm 1 CGM Algorithm for computing a $p^2$-ruling set.

Input: A linked list $L$ and a vector $s$ where $s[i]$ is the node following $i$ in the
list $L$. $L$ and $s$ are stored on a $p$ processor CGM with total $O(n)$ memory.
Output: A set of selected nodes of $L$ (which is a $p^2$-ruling set).

(1) Mark all list elements as not selected.

(2) FOR EVERY list element $i$ IN PARALLEL:

IF $l(i) < l(s[i]) > l(s[s[i]])$ THEN mark $s[i]$ as selected.

(3) Sequentially, at each processor, process the sublists of subsequent list
elements which are stored at the same processor. For each such sublist, mark
every second element as selected. If a sublist has only two elements, and not
both neighbors have a smaller label, then mark both elements of the sublist
as not selected.

(4) FOR $k = 1 \ldots \log p$ DO

(4.1) FOR EVERY list element $i$ IN PARALLEL:

IF $s[i]$ is not selected THEN set $s[i] := s[s[i]]$.

(4.2) FOR EVERY list element $i$ IN PARALLEL:

IF ($i$, $s[i]$ and $s[s[i]]$ are selected) AND NOT ($l(i) < l(s[i]) >
l(s[s[i]])$) AND ($l(i) \neq l(s[i])$) AND ($l(s[i]) \neq l(s[s[i]])$) THEN mark
$s[i]$ as not selected.

(4.3) Sequentially, at each processor, process the sublists of subsequent
selected list elements which are stored at the same processor. For each such
sublist, mark every second selected element as not selected. If a sublist
has only two elements, and not both neighbors have a smaller label, then
mark both elements of the sublist as not selected.

(5) Select the last element of $L$.

- End of Algorithm -

We first prove that the set of elements selected at the end of Algorithm 1 is
of size at most $O(n/p)$.

Lemma 1. After the $k$-th iteration in Step 4, there are no more than two selected
elements among any $2^k$ subsequent elements of the original list $L$.

Proof. Due to space limitations, the proof is omitted. It can be found in the full
version of this paper [5].

In order to show that subsequent elements selected at the end of Algorithm 1
have distance at most $O(p^2)$, we need the following lemmas.

Lemma 2. After every execution of Step 4.3, the distance of two subsequent
selected elements with respect to the current pointers (represented by vector $s$) is
at most $O(p)$.

Proof. Due to space limitations, the proof is omitted. It can be found in the full
version of this paper [5].

Lemma 3. After the $k$-th execution of Step 4.3, two subsequent elements with
respect to the current pointers (represented by vector $s$) have distance $O(2^k)$ with
respect to the original list $L$.

Proof. Obvious consequence of the fact that only $k$ pointer jumping operations
were so far executed in Step 4.1.

Lemma 4. No two subsequent selected elements have a distance of more than
$O(p^2)$ with respect to the original list $L$.

Proof. Follows from Lemma 2 and Lemma 3.

In summary, we obtain

Theorem 5. The list ranking problem for a linked list with $n$ vertices can be
solved on a CGM with $p$ processors and $O(\frac{n}{p})$ local memory per processor using
$O(\log p)$ communication rounds and $O(\frac{n}{p})$ local computation per round.


3  Euler  Tour  in  a  Tree 

Let $T = (V,E)$ be an undirected tree and $T^* = (V,E^*)$ be a directed graph with
$E^* = \{(v,w), (w,v) \mid \{v,w\} \in E\}$. Thus, $T^*$ is Eulerian because
$indegree(v) = outdegree(v)$ for each vertex $v$. The Euler Tour problem for $T$
consists of computing for $T^*$ a path that traverses each edge exactly once and
returns to its starting point, as well as for each vertex its rank in this path.

Theorem 6. The Euler Tour of a tree $T$ with $n$ vertices can be computed on
a CGM with $p$ processors and $O(\frac{n}{p})$ local memory per processor using $O(\log p)$
communication rounds and $O(\frac{n}{p})$ local computation per round.

Proof. Due to space limitations, the algorithm and proof are omitted. They can
be found in the full version of this paper [5].
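Since the algorithm itself is omitted here, the following Python sketch only illustrates the problem statement. It uses the standard successor-function construction of an Euler tour (the arc entering a vertex from one neighbour is followed by the arc leaving towards the next neighbour in that vertex's adjacency list) and then walks the resulting cycle sequentially. It is not the CGM algorithm of Theorem 6, and the names are hypothetical.

```python
def euler_tour(adj, root):
    """Return the Euler tour of a tree as a list of directed arcs.

    adj: dict vertex -> list of neighbours (each undirected edge appears in both lists).
    The position of an arc in the returned list is its rank in the tour.
    """
    nxt = {}
    for v, nbrs in adj.items():
        for idx, u in enumerate(nbrs):
            w = nbrs[(idx + 1) % len(nbrs)]
            # After entering v from u, leave v towards the next neighbour w.
            nxt[(u, v)] = (v, w)
    start = (root, adj[root][0])
    tour = [start]
    arc = nxt[start]
    while arc != start:
        tour.append(arc)
        arc = nxt[arc]
    return tour


tree = {1: [2, 3], 2: [1], 3: [1]}
tour = euler_tour(tree, 1)   # [(1, 2), (2, 1), (1, 3), (3, 1)]
```

The successor function `nxt` is exactly a linked list over the $2(n-1)$ arcs, which is why ranking the tour reduces to list ranking.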
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4  Connected  Components  and  Spanning  Forest 

Consider  an  undirected  graph  G  =  {y,E)  with  n  vertices  and  m  edges.  Each 
vertex  v  £  V  has  a  unique  label  between  1  and  n.  Two  vertices  u  and  v  are 
connected  if  there  is  an  undirected  path  of  edges  from  u  to  i;.  A  connected 
subset  of  vertices  is  a  subset  of  vertices  where  each  pair  of  vertices  is  connected. 
A  connected  component  of  G  is  defined  as  a  maximal  connected  subset. 

In  this  section,  we  study  the  problem  of  computing  the  connected  compon¬ 
ents  of  G  on  a  COM  with  p  processors  and  local  memory  per  processor. 

We  introduce  a  new  technique,  called  clipping,  which  refers  to  the  idea  of  tak¬ 
ing  a  PRAM  algorithm  for  the  same  problem  but  running  it  for  only  G(logp) 
rounds  and  then  finishing  the  computation  with  some  other  G(logp)  rounds  CGM 
algorithm.  (See  also  JaJa’s  accelerated  cascading  technique  for  the  PRAM  [19].) 

Steps 1 and 2 of Algorithm 2 simulate Shiloach and Vishkin’s PRAM algorithm
[30], but for log p phases only. Each vertex v has a pointer to a vertex parent(v)
such that the parent(v) pointers always form trees. The trees are also referred
to as supervertices. A tree of height one is called a star. An edge (u, v) is
live if parent(u) ≠ parent(v). Shiloach and Vishkin’s PRAM algorithm merges
supervertices along live edges until they equal the connected components. When
simulated on a CGM or BSP computer, Shiloach and Vishkin’s PRAM algorithm
results in log n communication rounds or supersteps, respectively.

Our CGM algorithm requires O(log p) rounds only. It simulates only the
first log p iterations of the main loop in the PRAM algorithm by Shiloach and
Vishkin and then completes the computation in another log p communication
rounds (Steps 3-7).

Algorithm 2  CGM Algorithm for Connected Component Computation
Input: An undirected graph G = (V, E) with n vertices and m edges stored
on a p processor CGM with total O(n + m) memory. Output: The connected
components of G represented by the values parent(v) for all vertices v ∈ V.

(1) FOR all v ∈ V IN PARALLEL DO parent(v) := v.

(2) FOR k := 1 to log p DO

(2.1) FOR all v ∈ V IN PARALLEL DO parent(v) := parent(parent(v)).

(2.2) FOR every live edge (u, v) IN PARALLEL DO (simulating concurrent
write)

(a) IF parent(parent(v)) = parent(v) AND parent(parent(u)) = parent(u)
THEN { IF parent(u) > parent(v) THEN parent(parent(u)) := parent(v)
ELSE parent(parent(v)) := parent(u) }

(b) IF parent(u) = parent(parent(u)) AND parent(u) did not get new
links in steps 2.1 and 2.2(a) THEN parent(parent(u)) := parent(v)

(c) IF parent(v) = parent(parent(v)) AND parent(v) did not get new
links in steps 2.1 and 2.2(a) THEN parent(parent(v)) := parent(u)

(2.3) FOR all v ∈ V IN PARALLEL DO parent(v) := parent(parent(v)).

(3) Use the Euler Tour algorithm in Section 3 to convert all trees into stars.
For each v ∈ V, set parent(v) to be the root of the star containing v. Let
G' = (V', E') be the graph consisting of the supervertices and live edges
obtained. Distribute G' such that each processor stores the entire set V'
and a subset of m/p edges of E'. Let E_i be the edges stored at processor
i, 0 ≤ i ≤ p − 1.

(4) Mark all processors as active.

(5) FOR k := 1 to log p DO

(5.1) Partition the active processors into groups of size two.

(5.2) FOR each group P_i, P_j of active processors, i < j, IN PARALLEL DO

(a) processor P_j sends its edge set E_j to processor P_i.

(b) processor P_j is marked as passive.

(c) processor P_i computes the spanning forest (V', E_s) of the graph
SF = (V', E_i ∪ E_j) and sets E_i := E_s.

(6) Mark all processors as active and broadcast E_0.

(7) Each processor i computes sequentially the connected components of the
graph G'' = (V', E_0). For each vertex v of V' let parent'(v) be the smallest
label parent(w) of a vertex w ∈ V' which is in the same connected component
with respect to G'' = (V', E_0). For each vertex u ∈ V stored at processor P_i
set parent(u) := parent'(parent(u)). (Note that parent(u) ∈ V'.)

—  End  of  Algorithm  — 
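The merging phase (Steps 4 to 7) can be sketched sequentially as follows. This is a minimal simulation, not the CGM implementation: each "processor" is just a list of edges, the spanning forest is computed with a union-find structure (an assumption, since the algorithm does not prescribe a sequential method), and all function names are hypothetical.

```python
def spanning_forest(vertices, edges):
    """Return a subset of edges that forms a spanning forest (union-find)."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    forest = []
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.append((u, v))
    return forest


def merge_rounds(vertices, edge_sets):
    """Steps 4-6: pair up active processors until one edge set E_0 remains."""
    active = [list(e) for e in edge_sets]
    rounds = 0
    while len(active) > 1:
        merged = []
        for i in range(0, len(active), 2):
            partner = active[i + 1] if i + 1 < len(active) else []
            merged.append(spanning_forest(vertices, active[i] + partner))
        active = merged
        rounds += 1
    return active[0], rounds


def component_labels(vertices, forest_edges):
    """Step 7: label every vertex with the smallest label in its component."""
    adj = {v: [] for v in vertices}
    for u, v in forest_edges:
        adj[u].append(v)
        adj[v].append(u)
    label = {}
    for s in sorted(vertices):
        if s in label:
            continue
        label[s] = s
        stack = [s]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in label:
                    label[w] = s
                    stack.append(w)
    return label


vertices = [1, 2, 3, 4, 5, 6]
edge_sets = [[(1, 2), (2, 3)], [(1, 3), (4, 5)], [(5, 4)], [(2, 1)]]
e0, rounds = merge_rounds(vertices, edge_sets)
labels = component_labels(vertices, e0)
```

Because a spanning forest on V' never has more than |V'| − 1 edges, the edge set held by an active processor stays small after every merge, which is what keeps each of the log p rounds within the local memory bound.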

Lemma 7. [30] The number of different trees after iteration k of Step 2 is
bounded by (2/3)^k n.

We  obtain 

Theorem 8. Algorithm 2 computes the connected components and spanning forest
of a graph G = (V, E) with n vertices and m edges on a CGM with p processors
and O((n+m)/p) local memory per processor, (n+m)/p ≥ p^ε (ε > 0), using O(log p)
communication rounds and O((n+m)/p) local computation per round.

Proof.  Due  to  space  limitations,  the  proof  is  omitted.  It  can  be  found  in  the  full 
version  of  this  paper  [5] . 

5  Other  Graph  Problems 

In the remainder, we summarize our solutions for Problems 4-8. Due to space
limitations, the algorithms and proofs are omitted. They can be found in the full
version of this paper [5].

Lowest Common Ancestor: The lowest common ancestor, LCA(u, v), of two
vertices u and v of a rooted tree T = (V, E) is the vertex w that is an ancestor
to both u and v, and is farthest from the root. The problem of preprocessing T
in order to answer a query LCA(u, v) quickly for any pair (u, v) is called the
lowest-common-ancestor (LCA) problem.
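To make the definition concrete, here is a naive sequential query in Python. It is only an illustration of what LCA(u, v) returns, not the preprocessing-based CGM method of the paper; the `parent` dictionary representation (root mapped to `None`) is an assumption of this sketch.

```python
def lca(parent, u, v):
    """Return the lowest common ancestor of u and v in a rooted tree.

    parent maps each vertex to its parent; the root maps to None.
    A vertex counts as an ancestor of itself.
    """
    ancestors = set()
    while u is not None:  # collect u and all of its ancestors
        ancestors.add(u)
        u = parent[u]
    while v not in ancestors:  # climb from v until the two paths meet
        v = parent[v]
    return v


parent = {"r": None, "a": "r", "b": "r", "c": "a", "d": "a"}
```

Each such query costs time proportional to the height of the tree, which is why the problem is posed as a preprocessing problem.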

Theorem 9. Consider a rooted tree T = (V, E) with n vertices. The LCA problem
can be solved on a CGM with p processors and O(n/p) local memory per
processor using O(log p) communication rounds and O(n/p) local computation per
round.


Tree Contraction and Expression Tree Evaluation: We observe that the
classical tree contraction and expression tree evaluation algorithm of [24] can be
easily implemented on a CGM to run in O(log p) communication rounds.

Observation 1. Tree contraction and expression tree evaluation on a tree T with
n nodes can be performed on a CGM with p processors and O(n/p) local memory
per processor, n/p ≥ p^ε (ε > 0), using O(log p) communication rounds and O(n/p)
local computation per round.


Open Ear Decomposition and Biconnected Components: Consider an
undirected graph G = (V, E) with n vertices and m edges. For the remainder, we
assume that G is connected. An ear decomposition of G is an ordered partition of
E into r simple paths P_1, ..., P_r such that P_1 is a cycle, and, for each 2 ≤ i ≤ r,
P_i is a simple path with endpoints belonging to P_1 ∪ ... ∪ P_{i−1} but with none
of its internal vertices belonging to P_j, j < i. The paths P_i are called ears. If
none of the P_i, i > 1, is a cycle, then the decomposition is called an open ear
decomposition. For an edge e in P_i, let i be the ear number of e. An edge e ∈ E
is a cut-edge if e does not lie on a cycle in G. A connected undirected graph G is
2-edge connected if it contains no cut-edge. G has an ear decomposition if and
only if G is 2-edge connected. A cut-vertex is a vertex whose removal leaves G
disconnected. G is biconnected if it contains at least three vertices and has no
cut-vertex.

Theorem 10. For a graph G = (V, E) with n vertices and m edges, the ear
decomposition, open ear decomposition, as well as its 2-edge connected and
biconnected components can be computed on a CGM with p processors and O((n+m)/p)
local memory per processor using O(log p) communication rounds and O((n+m)/p) local
computation per round.


Chordal Graph Recognition: A graph G = (V, E) is chordal if every cycle of
length greater than three has a chord, i.e., an edge connecting two non-consecutive
nodes of the cycle. A simplicial node is a node whose neighbors form a clique.
Dirac [15] showed that every chordal graph has a simplicial node. It is easy to
see that removing an arbitrary node from a chordal graph yields another chordal
graph. Therefore, after removing a simplicial node of a chordal graph, the new
graph has another simplicial node. Successively removing all simplicial nodes
gives an ordering of the nodes of G. This ordering is called a perfect elimination
ordering (PEO).
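The elimination process described above translates directly into a simple (and deliberately inefficient) sequential procedure. The sketch below is an illustration of the definition, not the parallel algorithm of the paper; the adjacency-dictionary input format and the function name are assumptions.

```python
def perfect_elimination_ordering(adj):
    """Repeatedly remove a simplicial node; return the removal order.

    adj maps each node to the set of its neighbors. Returns None when no
    simplicial node exists at some point, i.e. the graph is not chordal.
    """
    graph = {v: set(ns) for v, ns in adj.items()}
    order = []
    while graph:
        for v in sorted(graph):
            nbrs = graph[v]
            # v is simplicial if its neighbors are pairwise adjacent.
            if all(b in graph[a] for a in nbrs for b in nbrs if a != b):
                break
        else:
            return None  # no simplicial node left
        order.append(v)
        for u in graph[v]:
            graph[u].discard(v)
        del graph[v]
    return order


chordal = {1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4}, 4: {3}}
four_cycle = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
```

The chordless 4-cycle has no simplicial node, so the procedure stops immediately and reports that no PEO exists.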

Theorem 11. Finding the PEO of a given graph G = (V, E) with n vertices and
m edges can be solved on a CGM with p processors and O((n+m)/p) local memory
per processor, (n+m)/p ≥ p^ε (ε > 0), using O(log n log p) communication rounds and
O((n+m)/p) local computation per round.
Abstract. We construct a scheme for private information retrieval with
k databases and communication complexity $O(n^{1/(2k-1)})$.


1  Introduction 

Much attention has been given to the problem of protecting a database from a
user who tries to retrieve information that he is not allowed to access [2, 8, 12].

In some scenarios, the opposite problem can appear: a user wishes to retrieve
some information from a database without revealing to the database what
information he needs. For example [7], an investor wishes to receive information
about a certain stock but he does not wish others (even the database) to know
in which particular stock he is interested.

However, there is only one way to reach complete privacy: the user should
ask for a copy of the entire database. Otherwise, the database will get some
information about what the user wishes to know. This is not a good solution because
it requires much time and much communication from the database to the user.

If there are several identical copies of the database, another scenario is
possible [7]:

The user asks a query to each database and combines the results of the
queries, obtaining the desired information. Each query alone gives no information
about what the user is interested in.

Chor, Goldreich, Kushilevitz and Sudan [7] introduced this model and constructed
several schemes for a private retrieval of one bit from a database:

1. A scheme for 2 databases with $O(n^{1/3})$ communication ($n$ is the size of the
database).

2. A scheme for $k$ databases with $O(n^{1/k})$ communication.

3. A scheme for $O(\log n)$ databases with $O(\log^2 n \log\log n)$ communication.

In this paper, we improve their result, constructing a protocol for $k$ databases
with $O(n^{1/(2k-1)})$ communication.

* The author was supported by Latvia Science Council Grant 96.0282 and scholarship
"SWH Izglītībai, Zinātnei un Kultūrai" from Latvia Education Foundation.

Related work. Protocols for private information retrieval in [7] and this
paper have used ideas from several related problems (instance hiding and
multiparty communication complexity).

Instance hiding [1, 5, 6] is the problem of obtaining from the oracle the bit
associated with an input $i$ so that $i$ remains secret. There are some similarities
and some substantial differences between instance hiding and private information
retrieval (see [7] for a more detailed discussion).

Techniques from instance hiding were relevant to protocols for private
information retrieval in [7]. However, they are not used in this paper.

Multiparty communication complexity is also related to private information
retrieval. Pudlák, Rödl, Sgall [11] and Ambainis [3] have considered the problem
of computing $x_{(i+j) \bmod n}$, where $x$ is a string of $n$ bits and $i, j$ are integers,
in the following model:

Player  1  knows  x,  i,  Player  2  knows  x,j.  Each  of  them  sends  one  message  to 
Player  3.  Player  3  computes  the  result,  using  only  the  messages  received  from 
Players  1  and  2. 

Any protocol for the above problem can be easily transformed into a protocol
for private information retrieval. Thus, we can obtain nontrivial protocols for
private information retrieval with $o(n)$ communication.

Another  communication  complexity  problem  was  studied  by  Babai,  Kimmel 
and  Lokam[4].  It  also  can  be  applied  to  private  information  retrieval. 

However, all these protocols are less efficient than the protocols for private
information retrieval designed in [7]. Still, the ideas from [3, 4, 11] (not the explicit
protocols) can be useful in the study of private information retrieval. In particular,
this paper is based on the idea of combining two protocols which appeared
in the setting of multiparty communication complexity [3, 11].

2  Model 

Formally, we view the database as a string $x$ consisting of $n$ bits; $k$ denotes the
number of identical databases. We assume that the user wishes to retrieve a
single bit $x_i$ from the database.

We require that, for every database, indices $i$, $j$ and any message from the
user, the probability of the database receiving this message is equal when the
user retrieves the bit $x_i$ and when the user retrieves the bit $x_j$. This means
that the database does not get any information about $i$.

There are several extensions of this model. [7] considered schemes which allow
retrieving blocks of information and give a higher degree of privacy (knowing
$k - 1$ of $k$ queries gives no information about the bit that the user retrieves).
Ostrovsky and Shoup [9] have extended the results of [7] and designed schemes
for private information storage. Using their schemes, the user can both read and
write to the database without revealing which bit is accessed. They have shown
that any protocol for private information retrieval can be transformed into a
protocol for private information storage with a slight increase in the number of
databases and communication.

However, in this paper, we consider only the basic one-bit model of [7].

$S \oplus i$ denotes $S \cup \{i\}$ if $i \notin S$ and $S - \{i\}$ if $i \in S$.
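The operation S ⊕ i and the privacy requirement of the model can be illustrated with a well-known elementary two-database scheme. It uses linear communication and is not the protocol constructed in this paper; it is shown only as a sketch of how two individually uninformative queries combine into the desired bit. All function names are hypothetical.

```python
import random


def toggle(S, i):
    """S ⊕ i: add i to S if it is absent, remove it if it is present."""
    return S ^ {i}


def db_answer(x, S):
    """A database returns the exclusive-or of the bits x[j] for j in S."""
    acc = 0
    for j in S:
        acc ^= x[j]
    return acc


def retrieve(x, i, rng):
    """The user obtains x[i]; each database sees one uniformly random subset."""
    S1 = {j for j in range(len(x)) if rng.random() < 0.5}  # query to database 1
    S2 = toggle(S1, i)                                      # query to database 2
    return db_answer(x, S1) ^ db_answer(x, S2)


x = [1, 0, 1, 1, 0, 0, 1, 0]
rng = random.Random(0)
recovered = [retrieve(x, i, rng) for i in range(len(x))]
```

Every index other than i lies in both subsets or in neither, so its bit cancels in the final exclusive-or, while x[i] is counted exactly once. Each subset on its own is uniformly distributed whatever i is, which is the privacy condition stated above.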

3  Result 

Consider some protocol for private information retrieval. Does the user use all
bits in the messages from the databases? In some protocols, only a few bits are
really necessary. If the user knows in advance which bits are necessary, two
protocols can be combined, obtaining a third with more databases and less
communication.

Below, we show how to combine a protocol for 2 databases and a protocol for
$k - 1$ databases, obtaining a protocol for $k$ databases with less communication.

1. The user in the $k$ database protocol simulates the user in the protocol for
2 databases. Let $X_1$ denote the message sent to the 1st database and $X_2$ the
message sent to the 2nd database in the 2 database protocol.

The user sends $X_1$ to the 1st database and $X_2$ to the 2nd, ..., the $k$th database.

2. Then, the user computes the length of the reply from the 2nd database in the
2 database protocol and the positions of necessary bits in this reply. Further,
$m$ denotes the length and $n_1, \ldots, n_l$ denote the positions of the necessary
bits.

The user simulates the user in the protocols for $k - 1$ databases where the
$n_1$th, ..., $n_l$th bits from an $m$ bit database are retrieved, sending to the $(i + 1)$st
database the messages which are sent to the $i$th database in the $(k - 1)$
database protocol.

3. The 1st database simulates the 1st database in the 2 database protocol and sends
the user the same message.

4. The 2nd, ..., the $k$th database simulate the 2nd database in the 2 database
protocol. Instead of sending the message to the user, they consider it as a
new $m$-bit database.

Further, they simulate databases in the $(k - 1)$ database protocol for the
retrieval of the $n_1$th, ..., the $n_l$th bit and send the messages from these protocols
to the user.

5. The user simulates the user in the $(k - 1)$ database protocol for the retrieval
of the $n_1$th, ..., the $n_l$th bits. Then, knowing the message from the 1st database
and all the necessary bits from the second message, the user simulates the
user in the 2 database protocol. The result of this simulation is the bit that
the user wishes to retrieve.

If we wish to apply this idea, the 2 database protocol should satisfy certain
constraints:

1. Most of the communication goes from the databases to the user. (The
amount of communication from the user to the databases increases when two
protocols are combined. Hence, if it is already large, the combination is
useless.)

2. Only a few bits from the messages received by the user are necessary.

3. The user knows in advance which bits are necessary, i.e. the positions of
these bits do not depend on the databases’ contents.

Below, we use the idea of combining two protocols to prove

Theorem 1. Let $k \geq 2$. There exists a protocol for private information retrieval
with $k$ databases and $O(n^{1/(2k-1)})$ bits of communication.

Proof.  By  induction. 

The protocol for 2 databases was constructed by Chor, Goldreich, Kushilevitz
and Sudan [7]. The protocol for $k$ databases is obtained as the combination of
the protocols for 2 databases and $(k - 1)$ databases.

First, we describe the 2 database protocol that we use to obtain a $k$ database
protocol from a $(k - 1)$ database protocol.

1. Let $l = \lceil n^{1/(2k-1)} \rceil$. The database can be considered as a $(2k-1)$-dimensional
cube $\{0, \ldots, l-1\}^{2k-1}$. Each position $i \in \{0, \ldots, n-1\}$ in the database
corresponds to some position $(i_1, \ldots, i_{2k-1})$ in the cube.

The user chooses independently $(2k-1)$ random subsets of $\{0, \ldots, l-1\}$:
$S_1^1, \ldots, S_{2k-1}^1$. Let $S_1^2 = S_1^1 \oplus i_1$, ..., $S_{2k-1}^2 = S_{2k-1}^1 \oplus i_{2k-1}$,
where $(i_1, \ldots, i_{2k-1})$ is the position of the required bit in the $(2k-1)$-dimensional cube.

He sends $S_1^1, \ldots, S_{2k-1}^1$ to the 1st database and $S_1^2, \ldots, S_{2k-1}^2$ to the 2nd
database.

2. The 1st database computes the exclusive-or of the bits in positions
$(j_1, \ldots, j_{2k-1})$ such that $j_1 \in S_1^1, \ldots, j_{2k-1} \in S_{2k-1}^1$ and sends it to the user.

The database also computes the exclusive-or of the bits in positions
$(j_1, \ldots, j_{2k-1})$ such that $j_1 \in S'_1, \ldots, j_{2k-1} \in S'_{2k-1}$ for each possible
$S'_1, \ldots, S'_{2k-1}$ such that

(a) $S'_j = S_j^1 \oplus t$ for some $j \in \{1, \ldots, 2k-1\}$ and $t \in \{0, \ldots, l-1\}$;

(b) $S'_i = S_i^1$ for all $i \neq j$.

The exclusive-or for each possible $S'_1, \ldots, S'_{2k-1}$ is sent to the user, too.

3. The 2nd database computes the exclusive-or of the bits in positions
$(j_1, \ldots, j_{2k-1})$ such that $j_1 \in S_1^2, \ldots, j_{2k-1} \in S_{2k-1}^2$ and sends it to the user.

Further, the 2nd database computes the exclusive-or of the bits in positions
$(j_1, \ldots, j_{2k-1})$ such that $j_1 \in S'_1, \ldots, j_{2k-1} \in S'_{2k-1}$ for each possible
$S'_1, \ldots, S'_{2k-1}$ such that

(a) For each $i \in \{1, \ldots, 2k-1\}$, $S'_i$ is equal to $S_i^2$ or $S_i^2 \oplus t_i$ for some
$t_i \in \{0, \ldots, l-1\}$;

(b) There exist at least two $i \in \{1, \ldots, 2k-1\}$ such that $S'_i = S_i^2$.

The exclusive-or for each possible $S'_1, \ldots, S'_{2k-1}$ is sent to the user, too.

4. For each possible $S'_1, \ldots, S'_{2k-1}$ such that $S'_i$ is either $S_i^1$ or $S_i^2$, the user
finds the exclusive-or of bits in positions $(j_1, \ldots, j_{2k-1})$ satisfying
$j_1 \in S'_1, \ldots, j_{2k-1} \in S'_{2k-1}$.

(a) If $S'_i = S_i^2$ for at most one $i$, then the exclusive-or is one of the bits sent
by the 1st database.

(b) If $S'_i = S_i^2$ for at least two $i$, then the exclusive-or is one of the bits sent
by the 2nd database.

The user computes the exclusive-or of all these values. It is the necessary bit
from the database.

($S_j^2 = S_j^1 \oplus i_j$. Hence, $i_j$ belongs to exactly one of $S_j^1$ and $S_j^2$, and
$i_1 \in S'_1, \ldots, i_{2k-1} \in S'_{2k-1}$ for exactly one choice of $S'_1, \ldots, S'_{2k-1}$.

For each other position $(j_1, \ldots, j_{2k-1})$, $j_1 \in S'_1, \ldots, j_{2k-1} \in S'_{2k-1}$
for an even number (possibly zero) of combinations $S'_1, \ldots, S'_{2k-1}$.

Hence, the exclusive-or computed by the user contains the bit in the position
$(i_1, \ldots, i_{2k-1})$ exactly once and any other bit an even number of times. It
follows that this exclusive-or is equal to the bit in the position $(i_1, \ldots, i_{2k-1})$,
i.e. the bit that the user wishes to retrieve.)
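The 2 database protocol above can be checked on a small cube with the following Python sketch. It is a toy simulation under stated assumptions: `d` stands for the dimension 2k − 1, `l` for the side length, the database is a dictionary indexed by cube positions, and all function names are hypothetical. Each database is given only its own family of subsets.

```python
import itertools
import random


def toggle(S, t):
    """S ⊕ t on frozensets."""
    return S ^ frozenset([t])


def cube_xor(x, sets):
    """Exclusive-or of the bits x[pos] over all pos in the product of sets."""
    acc = 0
    for pos in itertools.product(*sets):
        acc ^= x[pos]
    return acc


def retrieve(x, l, d, target, rng):
    S1 = [frozenset(t for t in range(l) if rng.random() < 0.5) for _ in range(d)]
    S2 = [toggle(S1[j], target[j]) for j in range(d)]

    # Database 1 knows only S1: unchanged sets, or one coordinate toggled.
    db1 = {None: cube_xor(x, S1)}
    for j in range(d):
        for t in range(l):
            sets = list(S1)
            sets[j] = toggle(S1[j], t)
            db1[(j, t)] = cube_xor(x, sets)

    # Database 2 knows only S2: at most d - 2 coordinates toggled.
    db2 = {}
    for r in range(d - 1):
        for changed in itertools.combinations(range(d), r):
            for ts in itertools.product(range(l), repeat=r):
                sets = list(S2)
                for j, t in zip(changed, ts):
                    sets[j] = toggle(S2[j], t)
                db2[(changed, ts)] = cube_xor(x, sets)

    # The user picks out the answers for every mix of S1- and S2-sets.
    result = 0
    for choice in itertools.product((1, 2), repeat=d):
        twos = [j for j in range(d) if choice[j] == 2]
        if not twos:
            result ^= db1[None]
        elif len(twos) == 1:
            j = twos[0]
            result ^= db1[(j, target[j])]  # S1_j ⊕ i_j equals S2_j
        else:
            ones = tuple(j for j in range(d) if choice[j] == 1)
            result ^= db2[(ones, tuple(target[j] for j in ones))]  # S2_j ⊕ i_j equals S1_j
    return result


rng = random.Random(1)
l, d = 3, 3  # d = 2k - 1 with k = 2
x = {pos: rng.randint(0, 1) for pos in itertools.product(range(l), repeat=d)}
all_correct = all(retrieve(x, l, d, pos, rng) == x[pos] for pos in x)
```

The user only ever looks up 2^d answers, and the keys of those lookups depend on the target index alone, not on the contents of `x`. This is the property the combination method relies on.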

The amount of transmitted bits.

1. Communication from the user to the databases.

To transmit a set $S_j$ the user needs $l = \lceil n^{1/(2k-1)} \rceil$ bits. (For each
$x \in \{0, \ldots, l-1\}$, the user must say whether $x \in S_j$.) The user transmits
$2k - 1$ sets ($S_1^1, \ldots, S_{2k-1}^1$) to the 1st database and $2k - 1$ sets to the 2nd database.

So, the total amount of communication in this direction is
$2(2k - 1)l = O(n^{1/(2k-1)})$.

2. Communication from the 1st database to the user.

The 1st database computes the exclusive-or of the bits for several combinations
of $S'_1, \ldots, S'_{2k-1}$ and sends it to the user. The amount of bits transmitted
by the 1st database is equal to the number of the combinations of
$S'_1, \ldots, S'_{2k-1}$, i.e. $(2k - 1)l + 1$.

$k$ is a constant and $l = \lceil n^{1/(2k-1)} \rceil$. Hence, the amount of communication in
this direction is $O(n^{1/(2k-1)})$, too.

3. Communication from the 2nd database to the user.

Similarly to the previous case, the amount of bits transmitted by the 2nd
database is equal to the number of combinations $S'_1, \ldots, S'_{2k-1}$.

For the 2nd database, the amount of such combinations is at most
$(2^{2k-1} - 2k)\, l^{2k-3} = O(n^{(2k-3)/(2k-1)})$ because:

(a) Those $i$ for which $S'_i \neq S_i^2$ form a subset of $\{1, \ldots, 2k-1\}$ with at most
$2k - 3$ elements. (For at least two $i \in \{1, \ldots, 2k-1\}$, $S_i^2 = S'_i$.)

The amount of such subsets is $2^{2k-1} - 2k$.

(b) If we have chosen $i$ for which $S_i^2 \neq S'_i$, it remains to choose $t_i$. There are
$l$ possible values of $t_i$ for each $i$.

$t_i$ is chosen for at most $2k - 3$ values of $i$. Hence, there are at most
$l^{2k-3}$ possible combinations of $t_i$.

So, the user transmits $O(n^{1/(2k-1)})$ bits, the 1st database $O(n^{1/(2k-1)})$ bits and the
2nd database $O(n^{(2k-3)/(2k-1)})$ bits.

From the 2nd database’s answer the user needs a constant amount ($2^{2k-1} - 2k$)
of bits. The positions of these bits in the message from the 2nd database do not
depend on the contents of the database.
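The counts in this analysis can be checked numerically for small parameters. The sketch below assumes the protocol exactly as described above; `bit_counts` is a hypothetical helper name, and "exact" refers to a direct enumeration of the second database's combinations (a set of r toggled coordinates together with one value t_i per toggled coordinate).

```python
from math import comb


def bit_counts(k, l):
    """Bits sent in the 2 database protocol for side length l and constant k."""
    d = 2 * k - 1
    user = 2 * d * l                      # 2(2k-1) sets, l bits each
    db1 = d * l + 1                       # one toggled coordinate, or none
    db2_exact = sum(comb(d, r) * l**r for r in range(d - 1))  # r = 0 .. 2k-3
    db2_bound = (2**d - 2 * k) * l**(d - 2)
    needed = 2**d - 2 * k                 # bits the user actually reads from db 2
    return user, db1, db2_exact, db2_bound, needed


small = bit_counts(2, 3)
larger = bit_counts(3, 2)
```

For k = 2 and l = 3 the second database sends 10 bits, within the stated bound of 12, and the user reads only 4 of them. The number of bits read does not grow with l, which is what makes the combination with a (k − 1) database protocol worthwhile.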
Hence, we can combine the described protocol with a $(k - 1)$ database protocol,
using the method described at the beginning of this section.

Communication in the $k$ database protocol.

1. Communication from the user to the databases.

The user sends to the databases:

(a) The information from the 2 database protocol: $O(n^{1/(2k-1)})$ bits to each
database.

(b) The information for the simulations of the $(k - 1)$ database protocol:
$O(m^{1/(2k-3)})$ bits where $m$ is the length of the message from the 2nd
database in the 2 database protocol. We have $m = O(n^{(2k-3)/(2k-1)})$.
Hence, $O(n^{1/(2k-1)})$ bits are transmitted for this purpose.

2. Communication from the 1st database to the user. It is the same as in the 2
database protocol, i.e. $O(n^{1/(2k-1)})$ bits.

3. Communication from the 2nd, ..., the $k$th database to the user.

In each simulation of the $(k - 1)$ database protocol, these databases communicate

$O(m^{1/(2k-3)}) = O\big((n^{(2k-3)/(2k-1)})^{1/(2k-3)}\big) = O(n^{1/(2k-1)})$

bits. The amount of simulations performed by the databases is equal to the
amount of bits needed by the user from the 2nd database’s message, i.e.
constant. Hence, the communication by these databases is $O(n^{1/(2k-1)})$, too.

We have constructed a protocol with $k$ databases and $O(n^{1/(2k-1)})$ communication
from a protocol with $(k - 1)$ databases and $O(n^{1/(2k-3)})$ communication.
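The inductive step can be summarized as a short derivation. The notation $c_k(n)$ for the communication of the $k$ database protocol on an $n$ bit database is introduced here only for illustration and does not appear in the paper; the hidden constants depend on $k$, which is treated as a constant throughout.

```latex
\begin{align*}
c_{k-1}(m) &= O\!\left(m^{1/(2k-3)}\right), \qquad m = O\!\left(n^{(2k-3)/(2k-1)}\right) \\
c_{k-1}(m) &= O\!\left(\left(n^{(2k-3)/(2k-1)}\right)^{1/(2k-3)}\right) = O\!\left(n^{1/(2k-1)}\right) \\
c_k(n) &= O\!\left(n^{1/(2k-1)}\right) + \left(2^{2k-1}-2k\right) c_{k-1}(m) = O\!\left(n^{1/(2k-1)}\right)
\end{align*}
```

For example, with $k = 3$ the second database's reply has length $m = O(n^{3/5})$, and retrieving each needed bit from it with the 2 database protocol of [7] costs $O(m^{1/3}) = O(n^{1/5})$.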

Using the construction of Ostrovsky and Shoup [9] and the protocol described
above, we can obtain a scheme in which both reading and writing are private.
This scheme has $(k + 1)$ databases and $O(n^{1/(2k-1)} \log n)$ communication
complexity for any $k \geq 2$.
Computation Paths Logic:

An Expressive, yet Elementary, Process Logic

(abridged version)


David Harel and Eli Singerman
The Weizmann Institute of Science, Rehovot, ISRAEL

Abstract: A new process logic is defined, called computation paths logic (CPL),
which treats formulas and programs essentially alike. CPL is a pathwise extension of
PDL in the spirit of the logic R of Harel and Peleg. It enjoys most of the advantages
of previous process logics, yet is decidable in elementary time. We also offer extensions
for modeling asynchronous/synchronous concurrency and infinite computations.


1  Introduction 

Two major approaches to modal logics of programs are dynamic logic [Pr] and
temporal logic [Pn]. Propositional dynamic logic, PDL [FL], is a natural ‘dynamic’
extension of the propositional calculus, in which programs are intermixed
with propositions in a modal-like fashion. Formulas of PDL can express
many input/output properties of programs in a natural way. Moreover,
validity/satisfiability in PDL is decidable in exponential time, and the logic has a
simple complete axiomatization [KP]. PDL is thus a suitable system for reasoning
about the input/output behavior of sequential programs on the propositional
level. However, PDL is unsuited for dealing with the continuous, or progressive,
behavior of programs, i.e., the situations occurring during computations. The
need for reasoning about continuous behavior arises naturally in the study of
reactive and concurrent programs.

The  main  approach  proposed  in  response  to  this  need  is  temporal  logic,  TL 
[Pn],  in  which  assertions  can  be  made  naturally  about  the  progressive  behavior  of 
programs.  In  particular,  TL  can  easily  express  freedom  from  deadlock,  liveness, 
and  mutual  exclusion.  The  basic  versions  of  TL,  however,  are  not  compositional, 
in  the  sense  that  their  treatment  of  a  well-structured  program  does  not  derive 
directly  from  their  treatment  of  its  components.  Indeed,  TL  usually  does  not 
name  programs  at  all,  but  refers  to  instructions  and  labels  in  a  fixed  program. 
Although  TL  can  discuss  the  synthesis  of  complex  programs  from  simpler  ones 
to  some  extent  using  at  predicates,  this  method  is  rather  cumbersome. 

This dichotomy between the dynamic and temporal logic approaches has
prompted researchers to try to combine the best of the two in what is generally
called process logic. Accordingly, a system called PL was proposed in [HKP].
It borrows the program constructs and modal operators [ ] and ⟨ ⟩ from DL,
and the temporal connectives suf (similar to until) and f (standing for first)
from TL, and combines them into a single system. The expressive power of PL
is greater than that of PDL and of TL, and its validity/satisfiability problem
was shown in [HKP] to be decidable, though it is not known to be elementary.¹

There are some inconvenient features of PL, including the asymmetry of
its central path operator, suf, and the fact that its formula connectives are
somewhat weaker than its program operators. A proposal that overcomes these
problems is the regular process logic, RPL, of [HP]. In RPL, the operators suf
and f are replaced by chop and slice, corresponding essentially to Kleene’s
regular operations of concatenation and star. In this way, the regular operations
on programs, α ∪ β, αβ, α*, have natural counterparts on formulas:
X ∨ Y, X chop Y and slice X. It is shown in [HP] that RPL is even more
expressive than PL, and that its validity problem is also decidable but
nonelementary.

Using  the  fact  that  in  RPL  both  program  and  path  operators  are  those  of 
regular  expressions,  and  that  programs  and  formulas  are  interpreted  over  paths, 
a  uniform  process  logic  R  was  defined  in  [HP].  In  R,  formulas  are  constructed 
inductively  from  atomic  propositions  and  binary  atomic  programs,  using  a  single 
set  of  regular  operators.  It  was  shown  in  [HP]  that  R  is  more  expressive  than  RPL 
with  binary  atomic  programs,  and  is  decidable  (though,  again,  nonelementary). 

In the interest of obtaining a useful process logic decidable in elementary
time, an automata-oriented logic, YAPL, was defined in [VW]. In YAPL, formulas
are constructed using finite automata for both temporal (path) connectives
and for constructing compound programs from basic (atomic) ones. There is a
clear distinction between state and path formulas in YAPL, atomic programs
are binary and atomic formulas are restricted to being state formulas. YAPL
is indeed shown in [VW] to be decidable in elementary time (even over infinite
paths). YAPL formulas, however, can be somewhat less intuitive and not that
easy to comprehend.

In the present paper, we try to combine some of the advantages of previous
methods by introducing a new process logic that is compositional, uniform in its
treatment of programs and formulas, expressive enough to capture the interesting
path properties mentioned in the literature in a natural way, explicit in its
treatment of concurrency, and elementary decidable.

We term our basic formalism computation paths logic (CPL). A single set of
regular operators acts on both transition formulas (programs) and state formulas.
For example, a* · P · b is a CPL formula. (Here a and b are atomic programs
and P is an atomic state formula.) Intuitively this formula means: “perform
action a some nondeterministic number of times, check for property P and then do
action b”. An important operator in CPL is ‘∩’, pathwise intersection. Thus,
f ∩ g is true on paths that satisfy both f and g. Using this operator, it is possible
to express a large variety of properties of computation paths. For example,
α ∩ (skip* · P · skip*), where α is a program and P is a proposition, is true on
α-paths that contain some P-state. Note that a ∩ b, for atomic programs a and b,
is true only for paths which are both a-paths and b-paths, and is not expressible
by PDL programs or formulas.
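The intuition behind these operators can be sketched in Python under some simplifying assumptions that are not taken from the paper's formal definitions: a path is a finite tuple of states, an atomic program is a set of two-state paths, a state formula is a set of one-state paths, concatenation fuses two paths at a shared state, and star is cut off at a maximal path length. All names below are hypothetical.

```python
def concat(X, Y):
    """X · Y: fuse a path of X with a path of Y that starts where it ends."""
    return {p + q[1:] for p in X for q in Y if p[-1] == q[0]}


def star(X, states, max_len):
    """X*: zero or more X-steps, restricted to paths of at most max_len states."""
    result = {(s,) for s in states}
    frontier = result
    while frontier:
        step = {p for p in concat(frontier, X) if len(p) <= max_len}
        frontier = step - result
        result |= frontier
    return result


states = {0, 1, 2, 3}
a = {(0, 1), (1, 2)}   # atomic program a as a set of transitions
b = {(2, 3), (1, 3)}   # atomic program b
c = {(1, 2), (2, 3)}   # another atomic program, used for intersection
P = {(2,)}             # state formula P holds exactly in state 2

a_star_P_b = concat(concat(star(a, states, 4), P), b)  # a* · P · b
a_and_c = a & c                                        # a ∩ c
```

In this toy model, a* · P · b holds on the paths that take some a-steps, arrive in the P-state 2, and then take one b-step. The b-transition (1, 3) never appears, because state 1 does not satisfy P.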


^  Some  versions  of  PL  have  been  shown  to  be  nonelementary  [Ha],  but  it  is  still  not 
known  whether  PL  itself  is  elementary. 
Unlike PL and its descendants, RPL and R, we have decided not to include
the modal operators $[\ ]$ and $\langle\ \rangle$ in CPL. The reason is as
follows. Consider a PL/RPL/R formula of the form $[\alpha]\varphi$, where
$\alpha$ is a program and $\varphi$ is a path property. While one might expect
this formula to be true on all $\alpha$-paths that satisfy $\varphi$, in PL it
is defined to be true on all paths p which, when extended by an $\alpha$-path r,
result in a path $p \cdot r$ satisfying $\varphi$. This, however, corresponds to
the above intuition only when p is a path of length 0, i.e., a state. This
broader (and somewhat complicated) definition in PL is an unavoidable outcome
of the wish of the authors of [HKP] to have only path formulas, but at the same
time use $\langle\ \rangle$ and $[\ ]$ as in PDL. (For example, they wanted
$\langle\alpha;\beta\rangle\varphi$ to be equivalent to
$\langle\alpha\rangle\langle\beta\rangle\varphi$.)

To  make  our  logic  elementary  decidable,  we  use  a  special  form  of  negation. 
Specifically,  negation  in  CPL  is  not  taken  relative  to  the  set  of  all  paths  (as 
is  done,  e.g.,  in  PL/RPL/R).  In  fact,  a  negated  formula  is  a  state  property, 
made  true  in  any  state  that  is  not  the  initial  state  of  a  path  that  satisfies  the 
argument formula. For example, $\neg(a \cdot P)$ asserts "it is not possible to carry out
a computation of $a \cdot P$ from the present state". While this form of negation is
weaker  than  negation  relative  to  all  paths,  most  interesting  path  properties  are 
still  expressible. 

In  Section  3,  we  show  that  CPL  is  elementary  decidable,  by  reducing  its 
satisfiability  problem  to  that  of  APDL,  the  version  of  PDL  in  which  programs 
are  represented  by  finite  automata  rather  than  regular  expressions  [HS2].  The 
reduction  is  rather  involved,  and  combines  ideas  from  both  [Pe]  and  [SPH]. 

In Section 4 we propose an extension of CPL for modeling concurrent processes,
called ICPL. It uses '||' to denote interleaving. This might be termed
asynchronous concurrency. Even though the interleaving operator itself is very
intuitive, combining it with other operators (especially '$\cap$') turns out to
be rather technically involved. Nevertheless, ICPL is also decidable in
elementary time.

To model synchronous concurrency, we introduce a further extension in
Section 5, called SICPL (ICPL with synchronization). In SICPL, which is shown
to be elementary, interleaving can be synchronized with respect to subsets of
atomic programs. For such a subset syn, and formulas $f$ and $g$, the
interleaving of $f$ and $g$ synchronized on syn is expressed by
$f \mid syn \mid g$ (the notation is apt, since '||' denotes the special case
where $syn = \emptyset$). For example, the formula
$(a \cup b) \cdot c \mid a,b \mid (a \cup c) \cdot P \cdot (b \cup c)$ is true
only in paths of the form:

[Figure: three paths whose labels read, in order: a c P c; a P c c; c P b c.]

A further elementary extension of CPL for expressing properties of infinite
computations, $\omega$CPL, is defined in Section 6.

2  Definitions  and  Basic  Observations 

Definition  1.  A  path  over  a  set  S  is  simply  a  non-empty  finite  sequence  of 
elements  of  S.  The  notions  first,  last  and  fusion,  denoted  p  •  q,  are  defined  in 
the  usual  way. 
We now define computation paths logic, CPL for short. It has two sorts,
a set ASF of atomic state formulas (propositions), and a set ATF of atomic
transition formulas (programs). The set of formulas is defined as the least set
containing ASF and ATF, and such that if $f$ and $g$ are formulas, then so are
$(\neg f)$, $(f^*)$, $(f \cdot g)$, $(f \cup g)$ and $(f \cap g)$. (We often
omit the parentheses where there is no confusion.)

CPL formulas are interpreted over models $M = (S_M, \rho_M)$, where $S_M$ is
the set of states, $\rho_M(P) \subseteq (S_M)$ for each $P \in$ ASF, and
$\rho_M(a) \subseteq S_M \times S_M$ for each $a \in$ ATF. In addition,
$\rho_M$ is extended to all formulas as follows:

$\rho_M(f \cdot g) = \rho_M(f) \cdot \rho_M(g)$        $\rho_M(f \cup g) = \rho_M(f) \cup \rho_M(g)$

$\rho_M(f \cap g) = \rho_M(f) \cap \rho_M(g)$        $\rho_M(f^*) = \rho_M(f)^*$

$\rho_M(\neg f) = (S_M) \setminus first(\rho_M(f))$

(We often leave out the M subscript of $S$ and $\rho$.) A path p in a model M
satisfies a CPL formula $f$, written $M, p \models f$, when $p \in \rho_M(f)$.
A formula $f$ is satisfiable iff $M, p \models f$ for some path p in some
model M. A state s in a model M satisfies a CPL formula $f$, written
$M, s \models f$, iff there exists a path satisfying $f$ whose first state is s.
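
The semantic clauses above can be tried out on a small finite model. The following is a minimal sketch, not part of the paper: formulas are encoded as nested tuples, paths as tuples of states, and the names `rho`, `fuse` and the `bound` parameter are assumptions made for illustration. Because the star can generate unboundedly long paths in a cyclic model, the sketch only keeps paths of length at most `bound`, so star and negation are exact only when `bound` is at least the length of the longest path in the model (as in the acyclic example used here).

```python
def fuse(ps, qs, bound):
    """Fusion of path sets: glue p and q when last(p) == first(q)."""
    return {p + q[1:] for p in ps for q in qs
            if p[-1] == q[0] and len(p) + len(q) - 2 <= bound}

def rho(f, model, bound=6):
    """Paths (tuples of states) of length <= bound that satisfy formula f."""
    states, val, trans = model
    op = f[0]
    if op == "prop":                      # atomic state formula: length-0 paths
        return {(s,) for s in val.get(f[1], set())}
    if op == "act":                       # atomic transition formula: length-1 paths
        return set(trans.get(f[1], set()))
    if op == "seq":
        return fuse(rho(f[1], model, bound), rho(f[2], model, bound), bound)
    if op == "or":
        return rho(f[1], model, bound) | rho(f[2], model, bound)
    if op == "and":                       # pathwise intersection
        return rho(f[1], model, bound) & rho(f[2], model, bound)
    if op == "star":                      # iterate fusion up to the length bound
        base = rho(f[1], model, bound)
        result = {(s,) for s in states}
        while True:
            bigger = result | fuse(result, base, bound)
            if bigger == result:
                return result
            result = bigger
    if op == "not":                       # states that start no f-path
        firsts = {p[0] for p in rho(f[1], model, bound)}
        return {(s,) for s in states if s not in firsts}
    raise ValueError(f"unknown operator: {op}")

# Hypothetical three-state model: 1 --a--> 2 --b--> 3, with P true in state 2.
model = ({1, 2, 3}, {"P": {2}}, {"a": {(1, 2)}, "b": {(2, 3)}})
a, b, P = ("act", "a"), ("act", "b"), ("prop", "P")
a_star_P_b = ("seq", ("seq", ("star", a), P), b)
not_a_P = ("not", ("seq", a, P))
```

Here `rho(a_star_P_b, model)` collects the paths of $a^* \cdot P \cdot b$, and `rho(not_a_P, model)` collects the states from which no $a \cdot P$ computation starts.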

Example: Consider the CPL formula
$\varphi : (P \cup a)^* \cdot Q \cap (b \cup c)^* \cdot \neg(b \cdot P) \cdot a$,
where $P, Q \in$ ASF and $a, b, c \in$ ATF. In the model illustrated in the
figure below, paths that satisfy $\varphi$ are (among others): (1, 2, 3, 4, 5),
(1, 2, 3, 1, 2, 3) and (1, 2, 3, 1, 2, 3). On the other hand, a path that does
not satisfy $\varphi$ is (1, 2, 7, 8, 9) (this is because
$(8) \not\models \neg(b \cdot P)$).


For CPL formulas $f$ and $g$, it is sometimes convenient to use the following
abbreviations: $f?$ instead of $\neg\neg f$, $f \vee g$ instead of
$f? \cup g?$, and $f \wedge g$ instead of $f? \cap g?$. Regarding transitions,
it is useful to use the following abbreviations: skip instead of
$\bigcup_{a \in ATF} a$, path instead of $skip^*$, and true instead of $path?$.
Note that path holds in every path in which consecutive states are connected by
some atomic transition. Moreover, it follows from the semantics of CPL that for
every $f \in$ CPL and every path p in any model M, if $p \in \rho_M(f)$ then
$p \models path$. Thus, path plays the role of 'true' for paths that correspond
to formulas. The formula true is a 'state version' of path and is true in every
path of length 0, i.e., in every state in every model.

Let  us  demonstrate  how  to  express  some  useful  path  properties  in  CPL. 

-  The existence of some segment of the path satisfying $f$ is expressed by
someseg$(f) = path \cdot f \cdot path$.

-  The existence of some prefix of the path satisfying $f$ is expressed by
somepre$(f) = f \cdot path$.

-  The existence of some suffix of the path satisfying $f$ is expressed by
somesuf$(f) = path \cdot f$.

-  The existence of some state in the path satisfying $f$ is expressed by
somestate$(f) =$ someseg$(f?)$.

-  An operator similar to $\bigcirc$ (nexttime) of TL is next$(f) = skip \cdot f$.

-  An operator similar to U of TL is $f$ until $g = (f \cdot skip)^* \cdot g$.

CPL can clearly be viewed as a pathwise extension of PDL. It is not too
difficult to show that PDL $\le$ CPL in expressive power, where we only
consider state formulas of CPL. Considering other process logics, CPL can be
thought of as a restricted version of the logic R of [HP], so that:
PDL $\le$ CPL $\le$ R. From this, and the fact that R is decidable [HP], we can
conclude that CPL is decidable. Since R is nonelementary [Pe], this yields a
nonelementary decision procedure for CPL. We will show in the next section,
however, that CPL is in fact elementary.

3  CPL  is  Elementary  Decidable 

In  this  section  we  show  that  satisfiability  of  CPL  formulas  is  decidable  in  el¬ 
ementary  time.  This  will  be  done  in  two  steps.  In  the  first,  we  carry  out  a 
reduction  from  the  satisfiability  problem  of  CPL  to  the  satisfiability  problem 
of  CPL  over  one-action-per-transition  models.  These  one-action-per-transition 
(oapt, for short) models are defined below. (These models were used in [Pe] for
the  logic  R.)  In  the  second  step  we  carry  out  a  reduction  from  the  satisfiability 
problem  of  CPL  over  oapt-models  to  the  satisfiability  problem  of  APDL. 

Definition 2. A model M is called an oapt-model relative to the ATF
$a_1, \ldots, a_n$ if for every $1 \le i \ne j \le n$,
$\rho(a_i) \cap \rho(a_j) = \emptyset$. A CPL formula $f$ is oapt-satisfiable
iff there exists some oapt-model which satisfies $f$.

Lemma 3. For every CPL formula $f$ over $\{a_1, \ldots, a_n\}$, there exists a
CPL formula $f'$ (over a new ATF') such that $f$ is satisfiable iff $f'$ is
oapt-satisfiable.

Proof: Let $f$ be a formula over $\{a_1, \ldots, a_n\}$. We define a set ATF'
of $2^n - 1$ new symbols (to be used as the atomic transition formulas of
$f'$), each of the form $a_{c_1 c_2 \cdots c_n}$ where $c_k \in \{k, \bar{k}\}$:

$ATF' = \{ a_{c_1 c_2 \cdots c_n} \mid \forall\, 1 \le k \le n,\ c_k \in \{k, \bar{k}\} \} \setminus \{ a_{\bar{1}\bar{2}\cdots\bar{n}} \}$.

Let $f'$ be the formula obtained from $f$ by replacing every appearance of
$a_k$, for $1 \le k \le n$, with
$\beta_k = \bigcup \{ a_{c_1 \cdots c_n} \mid c_k = k \}$. The following claim
(the proof of which we omit here) completes the proof of the lemma.

Claim: $f$ is satisfiable $\iff$ $f'$ is oapt-satisfiable. ∎
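
To make the construction in the proof concrete, here is a small sketch. It is an illustration under assumptions, not the paper's text: each new symbol $a_{c_1 \cdots c_n}$ is encoded as the non-empty set of indices k whose component is unbarred, and the function names are hypothetical. The paper omits the proof of the claim, so `to_oapt` only shows one natural way a model could be relabelled (each transition pair receives the single symbol recording exactly which $a_k$ hold on it); it is a guess at the intended direction, not the authors' argument.

```python
from itertools import combinations

def new_symbols(n):
    """All non-empty subsets of {1..n}; subset C encodes a_{c_1...c_n}
    with c_k unbarred exactly when k is in C."""
    idx = range(1, n + 1)
    return [frozenset(c) for r in range(1, n + 1) for c in combinations(idx, r)]

def beta(k, n):
    """Symbols whose union replaces a_k: those with the k-th component unbarred."""
    return [c for c in new_symbols(n) if k in c]

def to_oapt(trans, n):
    """Relabel transitions so that every pair of states carries one new symbol.
    `trans` maps an index k to the set of pairs in rho(a_k)."""
    label = {}
    for k in range(1, n + 1):
        for pair in trans.get(k, set()):
            label.setdefault(pair, set()).add(k)
    new_trans = {}
    for pair, ks in label.items():
        new_trans.setdefault(frozenset(ks), set()).add(pair)
    return new_trans
```

By construction the relations returned by `to_oapt` are pairwise disjoint, which is the oapt condition of Definition 2.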

As  preparation  for  the  reduction  to  APDL,  let  us  start  the  discussion  in 
the  framework  of  PDL.  Recall  that  a  PDL  model  is  also  a  CPL  model;  note, 
however,  that  while  CPL  formulas  are  interpreted  over  paths,  PDL  formulas  are 
interpreted  over  states.  To  overcome  this  dichotomy  we  shall  relate  paths  to  PDL 
programs  in  the  following  way: 
Definition 4. For a PDL program $\alpha$ and a path $p = (p_0, \ldots, p_k)$ in
a model $(S, \tau, R)$, $p \in \alpha$ is defined by induction on the structure
of $\alpha$: if $\alpha \in$ ATF then $p \in \alpha$ iff $k = 1$ and
$(p_0, p_1) \in R(\alpha)$; $p \in \alpha \cup \beta$ iff $p \in \alpha$ or
$p \in \beta$; $p \in \alpha;\beta$ iff there are paths $q \in \alpha$ and
$r \in \beta$ with $p = q \cdot r$; $p \in \alpha^*$ iff $p \in \alpha^i$ for
some $i \ge 1$ or $p = (p_0)$; $p \in \varphi?$ iff $p = (p_0)$ and
$(p_0, p_0) \in R(\varphi?)$.

Via  this  association  we  can  view  PDL  programs  as  being  carried  out  along  paths 
rather  than  as  binary  relations.  For  the  reduction,  however,  it  is  more  convenient 
to  use  the  automata  version  of  PDL,  namely  APDL  [HS2].  The  reason  for  this 
is that '$\cap$' can be handled more economically by automata than by regular
expressions.  (This  also  applies  to  other  operators  used  in  the  extensions  of  CPL 
we  define  later  on.)  Even  though  APDL  formulas  are,  in  general,  more  succinct 
than  their  equivalent  PDL  formulas,  satisfiability  for  APDL  can  be  decided  in 
EXPTIME [HS2]. This is also the case for deciding satisfiability over
oapt-models. For if $M, s \models \varphi$, then M can be transformed into an
oapt-model of $\varphi$ (by duplicating states).

We shall use this to get an elementary decision procedure for CPL by carrying
out a reduction from CPL into APDL. Relating paths in a model to APDL
programs is done as in Def. 4, i.e., if $\alpha$ is an automaton (APDL program)
then $p \in \alpha$ iff $p \in r(\alpha)$, where $r(\alpha)$ is a regular
expression denoting the language of $\alpha$.

Lemma 5. For every CPL formula $f$ there exists an APDL program (NFA) $A_f$,
such that for every path p in every oapt-model, $p \in \rho(f)$ iff $p \in A_f$.

Proof: The APDL automaton (program) $A_f$ corresponding to the CPL formula
$f$ is built by induction on the structure of $f$. Here we briefly describe the
following two (non-routine) cases. For $\neg g$ we let $A_{\neg g}$ be a two
state NFA accepting the (one word) language $\{([A_g]\,false)?\}$. For
$g \cap h$ we have to be careful since the '$\cap$' in CPL is intersection in
the path sense rather than in the language sense. We use the fact that we are
dealing with oapt-models and build $A_{g \cap h}$ that simulates both $A_g$ and
$A_h$, synchronizing on ATF-letters. ∎
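
The proof only names the idea of simulating two automata while synchronizing on ATF-letters. The sketch below shows one plausible reading of that idea, under assumptions that are not in the paper: an NFA is a triple `(start, finals, delta)` with `delta` mapping `(state, letter)` to a set of states, letters in `atf` must be taken by both components together, and every other letter (a test) is taken by one component alone. The names `sync_product` and `accepts` are hypothetical, and details such as the ordering of tests at a single state are not addressed.

```python
def sync_product(A, B, atf):
    """Product NFA: joint moves on letters in `atf`, independent moves on tests."""
    (sa, fa, da), (sb, fb, db) = A, B
    start = (sa, sb)
    delta, seen, todo = {}, {start}, [start]
    while todo:
        p, q = todo.pop()
        moves = []
        for (p0, x), targets in da.items():
            if p0 != p:
                continue
            if x in atf:                       # both automata read x together
                for q1 in db.get((q, x), set()):
                    moves += [(x, (p1, q1)) for p1 in targets]
            else:                              # a test of A; B stays put
                moves += [(x, (p1, q)) for p1 in targets]
        for (q0, y), targets in db.items():
            if q0 == q and y not in atf:       # a test of B; A stays put
                moves += [(y, (p, q1)) for q1 in targets]
        for x, nxt in moves:
            delta.setdefault(((p, q), x), set()).add(nxt)
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    finals = {(p, q) for (p, q) in seen if p in fa and q in fb}
    return start, finals, delta

def accepts(nfa, word):
    """Standard NFA acceptance for a sequence of letters."""
    start, finals, delta = nfa
    current = {start}
    for x in word:
        current = {t for s in current for t in delta.get((s, x), set())}
    return bool(current & finals)

# Hypothetical automata for the words  a P? b  and  Q? a b.
A = (0, {3}, {(0, "a"): {1}, (1, "P?"): {2}, (2, "b"): {3}})
B = (0, {3}, {(0, "Q?"): {1}, (1, "a"): {2}, (2, "b"): {3}})
AB = sync_product(A, B, atf={"a", "b"})
```

With these inputs the product accepts the word `Q? a P? b`, in which the two a-steps and the two b-steps coincide, but rejects `a P? b`, since the second automaton insists on its initial test.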

Theorem 6. If we fix ATF to be a subset of $\{a_1, \ldots, a_n\}$, then
satisfiability of CPL formulas can be decided in 2EXPTIME.

Proof: Let $f$ be a CPL formula over ATF $\subseteq \{a_1, \ldots, a_n\}$. Use
Lemma 3 to construct $f'$ with new atomic transition formulas ATF', such that
$f$ is satisfiable iff $f'$ is oapt-satisfiable. Note that since the set
$\{a_1, \ldots, a_n\}$ is fixed, $|f'| = c_1 \cdot |f|$, for some constant
$c_1$. By Lemma 5, there exists an APDL program $A_{f'}$ (in the form of an NFA
over the alphabet ATF' $\cup$ Prop?) such that for every path p in every
oapt-model M: $p \in \rho(f')$ iff $p \in A_{f'}$. In other words,
$p \in \rho(f')$ iff $first(p) \models \langle A_{f'} \rangle true$. It is
known [HS2] that satisfiability of APDL formulas can be decided in
deterministic exponential time. One can easily prove by induction on the
structure of $f'$ that $|A_{f'}| \le 2^{c_2 |f'|}$ for some constant $c_2$
(actually, the exponent is needed only for the '$\cap$' case). So the overall
time complexity of deciding satisfiability of the original CPL formula $f$ is
bounded by $2^{2^{c_3 |f|}}$ for some constant $c_3$. ∎
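
The way the three bounds in the proof compose can be written out step by step. The following derivation is a sketch under stated assumptions: the automaton bound is taken to be singly exponential in $|f'|$, and $c$, $c'$ are hypothetical constants standing for the exponential-time APDL procedure of [HS2] and for the overhead of wrapping $A_{f'}$ in the formula $\langle A_{f'} \rangle true$.

```latex
\begin{align*}
|f'| &= c_1 \cdot |f|
  && \text{(Lemma 3, with } n \text{ fixed)} \\
|A_{f'}| &\le 2^{c_2 |f'|} = 2^{c_1 c_2 |f|}
  && \text{(Lemma 5)} \\
\mathrm{time}(f) &\le 2^{\,c \cdot |\langle A_{f'} \rangle \mathit{true}|}
  \le 2^{\,c' \cdot 2^{c_1 c_2 |f|}}
  && \text{(APDL satisfiability in EXPTIME)} \\
 &\le 2^{2^{c_3 |f|}}
  && \text{(choosing } c_3 \ge c_1 c_2 + \log_2 c' \text{, for } |f| \ge 1\text{)}
\end{align*}
```

The double exponential thus comes from one exponential for building the automaton and one for deciding APDL on it.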
4  CPL  with  Interleaving

The  motivation  for  adding  the  interleaving  operator  to  CPL  is  twofold.  Our 
primary  motivation  is  that  the  interleaving  operator  can  be  interpreted  as  the 
simplest  case  of  composition  used  in  algebraic  approaches  to  modeling  concurrent 
computation  (see,  e.g.,  [M]).  Interleaving  represents  the  case  where  processes 
run  concurrently  in  such  a  fashion  that  their  atomic  steps  can  be  arbitrarily 
interleaved  but  where  no  communication  between  them  takes  place.  This  form  of 
concurrency,  modeled  by  interleaving,  might  also  be  described  as  asynchronous. 
Second,  as  discussed  in  the  sequel,  using  interleaving  we  gain  succinctness. 

Let us now define ICPL (CPL with interleaving). The syntax of ICPL extends
that of CPL as follows: if $f$ and $g$ are formulas, then so is $(f \| g)$.
Turning to the semantics, the basic difficulty is that our $\rho$, which
associates paths with formulas, is not informative enough to capture
interleaving. For example, we would like the formula $(a \cdot P) \| b$ to be
satisfied by the paths: $a \cdot P \cdot b$ (i.e., an a-transition followed by a
b-transition, with P true in the intermediate state), $a \cdot b \cdot P$ and
$b \cdot a \cdot P$. However, paths of the second form would not appear if we
used $\rho(a \cdot P)$ and $\rho(b)$, since $\rho(a \cdot P)$ contains only
a-paths with P at the last state. To solve this problem we shall use a more
detailed version of $\rho$. The idea is that now $\rho_M(f)$ will contain, in
addition to paths in M that are associated with $f$, some 'evidence' of this
association. We will associate with each formula (via this extended $\rho$) a
set of computation paths (defined below) rather than a set of (ordinary) paths.
A computation path in a model M consists of two objects: a computation, which
is a sequence of transitions accompanied by a sequence of properties (state
formulas); and an ordinary path over M, i.e., a sequence of states of M. To get
a feeling for this, the figure below illustrates a computation path:

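[Figure: states s, t, r, with a transition labelled a from s to t and a
transition labelled a, c from t to r; P holds at s, $\neg(b \cdot Q)$ at t, and
R at r.]

Here, the path is (s, t, r), i.e., the sequence of states, and the computation
is $((a, \{a, c\}), (P, \neg(b \cdot Q), R))$.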

Definition 7. The set of state formulas SF is the minimal set of ICPL formulas
that contains ASF, contains all formulas of the form $\neg f$, and is closed
under $\cdot$ and $\cap$. For state formulas $f$ and $g$ of the form
$f = f^1 \cdot f^2 \cdots f^k$ and $g = g^1 \cdot g^2 \cdots g^l$, where
$k, l \ge 1$, let

$f \sqcap g = \begin{cases}
(f^1 \cap g^1) \cdots (f^k \cap g^k), & k = l \\
(f^1 \cap g^1) \cdots (f^l \cap g^l) \cdot f^{l+1} \cdots f^k, & k > l \\
(f^1 \cap g^1) \cdots (f^k \cap g^k) \cdot g^{k+1} \cdots g^l, & k < l
\end{cases}$

Definition 8. A computation is a pair $c = (Tran_c, Val_c)$, where $Tran_c$ is
a path over the set $2^{ATF} - \emptyset$ and $Val_c$ is a path of length
$|Tran_c| + 1$ over the set SF. The length of c, denoted $|c|$, is $|Val_c|$.


We now define several operations on computations. For this we use the two
computations

$c = (Tran_c, Val_c) = ((t_1, \ldots, t_k), (f_0, \ldots, f_k))$ and
$d = (Tran_d, Val_d) = ((r_1, \ldots, r_l), (g_0, \ldots, g_l))$.

-  $c \cdot d = ((Tran_c);(Tran_d),\ (Val_c) \cdot (Val_d))$, where
$(t_1, \ldots, t_k);(r_1, \ldots, r_l) = (t_1, \ldots, t_k, r_1, \ldots, r_l)$
and
$(f_0, \ldots, f_k) \cdot (g_0, \ldots, g_l) = (f_0, \ldots, f_k \cdot g_0, \ldots, g_l)$.

-  If c and d are of the same length (i.e., $k = l$), then
$c \cap d = ((t_1 \cup r_1, \ldots, t_k \cup r_k),\ (f_0 \sqcap g_0, \ldots, f_k \sqcap g_k))$.
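
The two operations on computations can be mirrored directly on tuples. The sketch below is illustrative only: a computation is encoded as a pair `(tran, val)` of tuples, transitions are `frozenset`s of atomic programs, and state formulas are opaque values combined into tagged tuples. As a simplification that is not in the paper, the state formulas of an intersection are paired with a plain `("and", f, g)` node rather than with the component-wise operator of Definition 7; the function names are hypothetical.

```python
def fuse_comp(c, d):
    """c . d: concatenate transitions; fuse the last formula of c with the first of d."""
    (tc, vc), (td, vd) = c, d
    return (tc + td, vc[:-1] + (("seq", vc[-1], vd[0]),) + vd[1:])

def meet_comp(c, d):
    """c n d for computations of equal length: unite transitions position-wise
    and pair the state formulas position-wise."""
    (tc, vc), (td, vd) = c, d
    if len(tc) != len(td):
        raise ValueError("intersection needs computations of the same length")
    return (tuple(t | r for t, r in zip(tc, td)),
            tuple(("and", f, g) for f, g in zip(vc, vd)))

# Two hypothetical one-step computations.
c = ((frozenset({"a"}),), ("P", "Q"))
d = ((frozenset({"b"}),), ("R", "S"))
```

For these inputs `fuse_comp(c, d)` has two transitions and three state formulas, the middle one being the fusion of `Q` and `R`, while `meet_comp(c, d)` has a single transition labelled by both `a` and `b`.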

The next operation we want to define is $c \| d$. In general, $c \| d$ is a set
of computations. A computation in $c \| d$ is obtained by sequentially
executing portions from c or from d. Let us make this notion more precise.
First, denote by $A \subseteq \{0, \ldots, k\}$ the set of indices s.t.
$i \in A$ iff $f_i$ is of the form
$f_i^1 \cdot f_i^2 \cdots f_i^{last(f_i)}$ (and $last(f_i) \ge 2$).

Next,  define  a  formula  portion  of  c  to  be  any  element  of  the  set 


Finally, a portion of c is a formula portion or a transition portion of c,
where a transition portion of c is an element of $Tran_c$. Portions of d are
defined in a similar way.

Constructing a computation $e \in c \| d$ is carried out as follows. Initialize
$Tran_e$ and $Val_e$ with ( ), and set pointers to the leftmost formula
portions of c and d. While there remain portions of c and d that have not been
dealt with, nondeterministically add to e the next portion of c or that of d,
and advance the corresponding pointer to the next portion, where the successor
of a transition is a formula and the successor of a formula portion is either
the next portion of the same formula or the next transition, if the current
portion is last in the formula. When one of c or d has been consumed, simply
add to e the remaining portions of the other.
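
Stripped of the bookkeeping for formula and transition portions, the procedure above is an order-preserving nondeterministic merge of two sequences. The following sketch enumerates all such merges for plain sequences of portions; it is a simplification made for illustration (the function name is hypothetical, and the distinction between formula portions and transition portions, as well as the rebuilding of $Tran_e$ and $Val_e$, is deliberately left out).

```python
def interleavings(xs, ys):
    """All merges of the tuples xs and ys that keep the internal order of each."""
    if not xs:
        return {tuple(ys)}
    if not ys:
        return {tuple(xs)}
    return ({(xs[0],) + rest for rest in interleavings(xs[1:], ys)} |
            {(ys[0],) + rest for rest in interleavings(xs, ys[1:])})
```

Applied to the portions `("a", "P")` and `("b",)`, it yields exactly the three orders mentioned earlier for $(a \cdot P) \| b$: a P b, a b P and b a P.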

Definition 9. A computation path in a model M is a pair $p = (Stat_p, C_p)$,
where $Stat_p$ is a nonempty path over $S_M$ (an ordinary path in the model M)
and $C_p$ is a computation with $|Val_{C_p}| = |Stat_p|$.

For a computation path $p = (Stat_p, C_p)$, we denote $Tran_{C_p}$ and
$Val_{C_p}$ by $Tran_p$ and $Val_p$, respectively. We intend to use a
computation path p as follows: $Stat_p$ will be the states along p, $Tran_p$
will be the sequence of transitions along p, and $Val_p$ will be the sequence of
state formulas satisfied in states along p. For example, a computation path p
with $Stat_p = (s, t, r)$, $Tran_p = (a, \{a, c\})$ and
$Val_p = (P, \neg(b \cdot Q), R)$ is illustrated in the figure prior to Def. 7.

We have defined $\cdot$ both on computations and on paths, and we now use these
together to define $p \cdot q$, for computation paths p and q (and then extend
it to sets of computation paths in the usual way):
$p \cdot q = (Stat_p \cdot Stat_q,\ C_p \cdot C_q)$.
Definition 10. Let CP be a set of computation paths in a model M. A path
$p = (s_0, \ldots, s_k)$ in M is CP-consistent with a computation
$c = ((t_1, \ldots, t_l), (f_0, \ldots, f_l))$ if the following conditions are
satisfied: (i) $|p| = |c|$ (i.e., $k = l$), (ii) for every $0 \le i < k$ there
exists $q \in CP$ s.t. $Stat_q = (s_i, s_{i+1})$ and $Tran_q = (t_{i+1})$, and
(iii) for every $0 \le i \le k$ there exists $q \in CP$ s.t. $Stat_q = (s_i)$
and $Val_q = (f_i)$.

We can now define the semantics of ICPL. Formulas are interpreted over the
same models as in CPL, that is, models of the form $M = (S_M, \rho_M)$, where
$S_M$ is the set of states, $\rho_M(P) \subseteq (S)$ for every element
$P \in$ ASF, and $\rho_M(a) \subseteq S \times S$ for every element
$a \in$ ATF.

Next, $\rho_M$ is extended by induction to a function which assigns a set
$\rho_M(f)$ of computation paths to every ICPL formula $f$. The set of all
computation paths assigned to formulas in this way (i.e., those that are in
$\rho_M(f)$ for some $f$) is denoted $CP(M)$. All the inductive cases in the
definition of $\rho_M$ are straightforward, except for the following two:

$\rho_M(f \cap g) = \{ r \mid \exists\, p \in \rho_M(f),\ q \in \rho_M(g)$ s.t.
$Stat_r = Stat_p = Stat_q$ and $C_r = C_p \cap C_q \}$

$\rho_M(f \| g) = \{ r \mid Stat_r$ is $CP(M)$-consistent with $C_r$ and
$C_r \in (C_p \| C_q)$, for some $p \in \rho_M(f),\ q \in \rho_M(g) \}$.

Definition 11. An ICPL formula $f$ is satisfied in a path p of a model M,
written $M, p \models f$, iff $p = Stat_q$ for some computation path
$q \in \rho_M(f)$. $f$ is satisfiable iff $M, p \models f$ for some path p of
some model M.

How does ICPL relate to CPL? Recall that ICPL is intended to be CPL extended
with the '||'-operator. While syntactically it is clear that
CPL $\subseteq$ ICPL, semantically this may seem less obvious due to the
differences in the definitions. We therefore proceed by showing that under the
canonical correspondence between CPL models and ICPL models, that is,
$\rho_M^{CPL} = \rho_M^{ICPL}$, this is indeed the case.

Proposition 12. For every CPL formula $f$ and every (ordinary) path p in any
model M, $M, p \models_{CPL} f$ iff $M, p \models_{ICPL} f$.

Proof: Omitted. ∎

In what sense is ICPL 'better' than CPL? Well, using the well known fact that
regular sets are closed under interleaving, it is not difficult to prove that
ICPL and CPL have the same expressive power. Nevertheless, ICPL has two
important advantages over CPL. The first is clarity in modeling asynchronous
concurrent computations. For example, consider the following two computations:
(i) Execute a, observe P and then perform b. (ii) Observe Q and then execute b
followed by a. In ICPL, we can use the formula
$a \cdot P \cdot b \,\|\, Q \cdot b \cdot a$ to model computations that arise
from running these two in parallel, while in CPL it appears that one must use a
much more cumbersome formula that explicitly lists many of the possible
interleavings. The second (and related) advantage of ICPL over CPL is
succinctness. It is known that the use of the interleaving operator can shorten
a regular expression by an exponential amount [F, MS]. It is true that
interleaving in ICPL is (in general) not interleaving in the language sense.
However, ICPL formulas that use only ATF and the operators '$\cup$' and '||'
correspond essentially to regular expressions (extended with the interleaving
operator) over the alphabet ATF. As to decidability, we have:

Theorem 13. Satisfiability of ICPL formulas with
ATF $\subseteq \{a_1, \ldots, a_n\}$ can be decided in 2EXPTIME.

5  ICPL  With  Synchronization 

ICPL is suited for modeling asynchronous concurrency. To model synchronous
concurrency as well, we introduce ICPL with synchronization (SICPL). All ICPL
formulas are SICPL formulas. In addition, if $f$ and $g$ are SICPL formulas and
syn is a subset of ATF, then $f \mid syn \mid g$ is a SICPL formula. (The set
syn has to be written out in full, for example as in
$(a \cdot b)^* \cdot P \mid a, b \mid (a \cup b)$.) Intuitively,
$f \mid syn \mid g$ represents the interleaving of $f$ and $g$ synchronized
w.r.t. syn. See the example in Section 1.

To present the formal semantics of SICPL (which will not be given here), one
has to modify each step in the definition of $\rho_M(f \| g)$. Here we have:

Theorem 14. Satisfiability of SICPL formulas with
ATF $\subseteq \{a_1, \ldots, a_n\}$ can be decided in 2EXPTIME.

6  Infinite  Computations 

CPL (and its extensions ICPL, SICPL) are input/output oriented and are
therefore appropriate for stating properties concerning programs with finite
computations. We wish, however, to make it possible to reason about processes
with possibly infinite computations. For example, we would like to say that the
infinite model $P \cdot a \cdot P \cdot a \cdots$ (the a's are transitions, and
the P's signify truth in the intermediate states) admits, in addition to the
finite computations described by $(P \cdot a)^*$, also the infinite computation
$(P \cdot a)^\omega$. With this idea in mind, we introduce the extension
$\omega$CPL.

Basically, one would like $\omega$CPL to extend CPL by employing the new
operator '$\omega$' and to use formulas of the form $f^\omega$, where $f$ is a
CPL formula. The most intuitive interpretation of $f^\omega$ is simply to
associate with it infinite paths that result by fusing infinitely many (finite)
paths of $f$ (that is, take $\rho(f^\omega)$ as $\rho(f)^\omega$). Choosing this
interpretation, however, forces one to make a distinction between
'$\omega$-formulas' (those with possibly infinite paths corresponding to the
$\omega$) and 'finite formulas'. This is necessary in order to interpret (or to
forbid) formulas of the form $f^\omega \cdot g$, $f^\omega \cdot g^\omega$, etc.

To enable a uniform representation, we have decided to adopt a more modest
interpretation of $f^\omega$, as follows. We shall consider $f^\omega$ rather
as a test, true in states (i.e., paths of length 0) from which it is possible
to repeatedly carry out computations of $f$ infinitely often. The advantage of
using this interpretation is that even though paths associated with formulas
are finite, and hence all CPL operators are applicable and retain their usual
meaning, it is still possible to make assertions concerning infinite
computations.
Definition 15. An $\omega$-path over a set S is an infinite sequence of
elements of S. For a set P of finite paths, let
$P^\omega = \{ p_1 \cdot p_2 \cdot p_3 \cdots \mid \forall i \ge 1,\ p_i \in P \}$.
That is, $P^\omega$ is the set of finite and infinite paths obtained by
repeatedly fusing (finite) paths from P infinitely often.

The syntax is such that $\omega$CPL contains all CPL formulas, and in addition
if $f$ and $g$ are $\omega$CPL formulas, then so are $(\neg f)$, $(f^*)$,
$(f^\omega)$, $(f \cdot g)$, $(f \cup g)$ and $(f \cap g)$. As for semantics,
$\omega$CPL is interpreted over the same models as CPL. Given a model M and an
$\omega$CPL formula $f$, $\rho_M(f)$ is defined exactly as in CPL with the
addition of the clause: $\rho_M(f^\omega) = first((\rho_M(f))^\omega)$.
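
On a finite model, the set of states from which computations of $f$ can be repeated forever can be obtained as a greatest fixpoint. The sketch below illustrates this under an assumption that is not in the paper: the input `pairs` is taken to be the exact set of (first state, last state) pairs of the paths in $\rho_M(f)$, computed by some other means; the function name is hypothetical.

```python
def omega_states(pairs):
    """States from which (first, last) pairs can be chained infinitely often.
    Greatest fixpoint: repeatedly drop states with no successor still alive."""
    alive = {s for s, _ in pairs}
    while True:
        keep = {s for s, t in pairs if t in alive}
        if keep == alive:
            return alive
        alive = keep
```

A pair `(s, s)` arising from a path of length 0 keeps `s` alive on its own, which matches the reading of $P^\omega$ as also containing finite paths obtained by fusing states infinitely often.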

$\omega$CPL can be considered to be a 'path version' of RPDL [HS1]. Indeed, we
can extend the embedding of PDL in CPL to an embedding of RPDL in $\omega$CPL
by: $(repeat(\alpha))' = (\alpha')^\omega$. Thus, $\omega$CPL's expressive
power is at least that of RPDL, which is known to be high (for example, it
exceeds that of CTL* [E]).

Proving that $\omega$CPL is elementary decidable is done by reducing its
satisfiability problem to that of ARPDL (the automata version of
PDL + repeat). Here, we omit the details, and only mention that this reduction
costs at most an exponential in added size. Thus, using the fact that ARPDL is
decidable in 3EXPTIME [VW], we have:

Theorem 16. Satisfiability of $\omega$CPL formulas with
ATF $\subseteq \{a_1, \ldots, a_n\}$ can be decided in 4EXPTIME.
Abstract. In this paper we develop a new exponential algorithm for
model-checking infinite sequential processes, including context-free
processes, pushdown processes, and regular graphs, that decides the full
modal mu-calculus. Whereas the actual model checking algorithm results
from considering conditional semantics together with backtracking
caused by alternation, the corresponding correctness proof requires
a stronger framework, which uses dynamic environments modelled by
finite-state automata.


1  Introduction 

Over  the  past  decade  model-checking  has  emerged  as  a  powerful  tool  for  the 
automatic  analysis  of  concurrent  systems.  Whereas  model-checking  for  finite- 
state  systems  is  nowadays  well-established,  the  theory  for  infinite  systems  is 
a  current  research  topic  (cf.  [BE97]).  Since  even  weak  branching  time  logics 
are  undecidable  for  infinite-state  systems  incorporating  parallel  operators,  much 
work  has  focused  on  the  verification  of  sequential  processes.  The  strongest  res¬ 
ults  obtained  so  far  show  the  decidability  of  monadic  second  order  logic  (MSOL) 
for  the  infinite  binary  tree  [Rab69],  pushdown  transition  graphs  [MS85],  regular 
graphs  [Cou90],  and  rational  restricted  recognizable  graphs  [Cau96].  However, 
all  decision  procedures  are  non-elementary  and  thus  not  applicable  to  practical 
problems. Moreover, MSOL is usually too expressive, since it allows one to
distinguish even bisimilar models. For these reasons, the modal mu-calculus is seen
as  an  attractive  alternative  for  specifying  behavioural  properties. 

The  model-checking  problem  for  sequential  processes  and  the  modal  mu- 
calculus  was  first  considered  in  [BS92].  The  authors  developed  an  iterative 
model-checking  algorithm  that  decides  the  alternation-free  part  of  the  modal 
mu-calculus  for  context-free  processes  based  on  a  conditional  formulation  of 
the semantics of $\mu$-formulas. Moreover, in [HS94] it is shown how this can be
done  using  tableaux-based  techniques,  allowing  local  model  checking.  Finally, 

*  This  work  was  supported  during  my  stay  at  IRISA  by  the  European  Community 
under  HCM  grant  ERBCHBGCT  920017,  and  during  my  stay  at  the  LFCS  by  the 
DAAD under grant D/95/14834 of the NATO science committee.
the approach was also extended to the strictly larger classes of pushdown
processes [BS95] and regular graphs [BQ97]. Since alternation of fixpoints
gives rise to a strict hierarchy [Bra96], the problem of model-checking the
full modal mu-calculus remained open. Only recently, Walukiewicz presented a
first exponential model-checking algorithm for pushdown processes based on
games [Wal96].

In this paper we develop an alternative algorithm which, essentially, arises as a combination of extending the standard iterative model-checking techniques with conditional reasoning, in order to capture sequential model structures in an alternation-free setting [BS92, BS95, BQ97], and the observation that alternating fixpoints require some kind of backtracking, as it is known from regular model checking (cf. e.g. [CKS92]). Whereas the actual model checker results from this combination, the corresponding correctness proof requires a stronger framework, which uses dynamic environments. In contrast to the ‘standard’ assertions, which suffice algorithmically, dynamic environments also explicitly model valuations of variables that occur free in the actual fixpoint computation. This explicit treatment is necessary in order to establish the link between the result of the fixpoint iteration and the semantics of the full modal mu-calculus.

Fortunately, all this additional complexity is only required for the proof and need not be considered for an implementation. Taking $|\mathcal{C}|$ as the number of transitions, and $|Q|$ as the branching degree in the finite sequential process representation, as well as $|\Phi|$ as the size of the formula, and “ad” as the alternation depth of the formula under consideration, the overall complexity^ is

o(  1^1  *  (IQI  * 

Note that this does not only cover context-free and pushdown processes, but also regular graphs, which are not covered by the algorithm proposed by Walukiewicz. It is not at all clear whether a similar extension is also possible for Walukiewicz’ algorithm.

The  plan  of  the  paper  is  now  as  follows.  The  next  section  describes  the  class 
of  processes  we  will  consider,  and  presents  the  modal  mu-calculus.  Subsequently, 
we  develop  our  model-checking  algorithm  which  is  proved  to  be  correct  in  Section 
4.  The  final  section  contains  our  conclusions  and  directions  for  future  research. 
Proofs  and  further  details  can  be  found  in  the  full  version  [BS97]. 

2  Processes  and  Specifications 

Infinite sequential processes comprise context-free processes, pushdown processes, and regular graphs. In this paper we will mainly concentrate on the model-checking problem for context-free processes, as the extension to pushdown processes, respectively regular graphs, can be obtained following the lines of [BS95], respectively [BQ97].

^ In this paper we neglect the optimization of [LBC+94] which exploits monotonicity arguments and would reduce $ad(\Phi)$ to $ad(\Phi)/2$.
2.1  Context-Free  Processes

As usual, we consider labelled transition graphs as models for the behaviour of concurrent systems, since they allow one to represent the underlying semantics of many process calculi. In particular, we are interested in classes of infinite transition graphs which can be finitely represented by labelled rewrite systems.

Definition 2.1. A labelled transition graph is a triple $T = (S, Act, \to)$ where $S$ is the set of states, $Act$ is the set of transition labels (or actions), and $\to\ \subseteq S \times Act \times S$ is the transition relation.

Definition 2.2. A labelled rewrite system is a triple $\mathcal{R} = (V, Act, R)$ where $V$ is an alphabet, $Act$ is a set of labels, and $R \subseteq V^* \times Act \times V^*$ is a finite set of rewrite rules. If the rewrite rules are of the form $R \subseteq V \times Act \times V^*$ the rewrite system is called alphabetic.

In the remainder of the paper, a rewrite rule $(u, a, v) \in R$ is also written as $u \xrightarrow{a} v$. In general, rewrite systems are used to define a rewrite relation on words of $V^*$ where a rewrite rule may be applied at any position. The technical development of this paper concentrates on rewritings of the following restricted form.

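Definition 2.3. Let $\mathcal{R} = (V, Act, R)$ be a rewrite system. Then the prefix rewriting relation of $\mathcal{R}$ is defined by $\mapsto_{\mathcal{R}}\ =_{df} \{ (uw, a, vw) \mid (u \xrightarrow{a} v) \in R,\ w \in V^* \}$, and the labelled transition graph $T_{\mathcal{R}} =_{df} (V^*, Act, \mapsto_{\mathcal{R}})$ is called the prefix transition graph of $\mathcal{R}$. By abuse of notation, we will henceforth write $uw \xrightarrow{a} vw$ instead of $uw \mapsto^{a}_{\mathcal{R}} vw$.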

An alphabetic rewrite system which is interpreted wrt. prefix rewriting is called a context-free system, and a context-free process is then the rooted prefix transition graph of a context-free system. Note that the states of a context-free process are words over $V$, and we will henceforth use lower-case Greek letters $\alpha, \beta, \ldots$ to denote them. One standard example for a context-free process is the prefix
transition  graph  of  Cex  =  {  A  A  AB,  A  A  e,  B  A  e}  rooted  at  A. 
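The prefix rewriting relation of Definition 2.3 can be illustrated with a minimal Python sketch. It assumes words are encoded as strings over single-character symbols and rules as triples (u, a, v); the labels a, b, c attached to the three example rules below are hypothetical, since the labels of the example system are not legible in this text.

```python
from typing import List, Tuple

# A rewrite rule (u, a, v): the prefix u may be replaced by v, emitting label a.
Rule = Tuple[str, str, str]


def successors(rules: List[Rule], word: str) -> List[Tuple[str, str]]:
    """Return all (label, word') such that word --label--> word' by prefix rewriting."""
    result = []
    for u, a, v in rules:
        # Prefix rewriting: a rule applies only at the left end of the word.
        if word.startswith(u):
            result.append((a, v + word[len(u):]))
    return result


# Hypothetical labelling of the example rules A -> AB, A -> eps, B -> eps.
C_EX: List[Rule] = [("A", "a", "AB"), ("A", "b", ""), ("B", "c", "")]

if __name__ == "__main__":
    print(successors(C_EX, "A"))   # one-step successors of the root A
    print(successors(C_EX, "AB"))  # the suffix B is carried along unchanged
```

Because the rules are alphabetic, only the first symbol of a state is ever inspected, which is what makes the infinite graph finitely representable.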

2.2  The  Modal  Mu-Calculus 

Nowadays it is widely accepted that system properties can conveniently be expressed by temporal logic formulas. In particular, the modal mu-calculus as introduced by Kozen [Koz83] is a powerful branching time logic. It combines standard modal logic with least and greatest fixpoint operators, which allows one to express very complex temporal properties within this formalism. Due to its expressiveness and its conciseness the mu-calculus can be regarded as the “assembly language” of temporal logics. Formulas of the mu-calculus, given in positive form, are defined by the following grammar

$\Phi ::= tt \mid ff \mid X \mid \Phi \vee \Phi \mid \Phi \wedge \Phi \mid [a]\Phi \mid \langle a\rangle\Phi \mid \mu X.\Phi \mid \nu X.\Phi$

where $X$ ranges over a (countable) set of variables $Var$, and $a$ over a set of actions $Act$. We will use $L_\mu$ to denote the set of all mu-calculus formulas.
Standard Semantics  Given $T_{\mathcal{R}} = (V^*, Act, \to)$, and a valuation $\mathcal{V} : Var \to 2^{V^*}$, the inductive definition below stipulates when a context-free process $\alpha \in V^*$ has the property $\Phi$, written as $\alpha \models_{\mathcal{V}} \Phi$. If $\alpha$ fails to satisfy $\Phi$ we will write $\alpha \not\models_{\mathcal{V}} \Phi$.


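\[
\begin{array}{lcl}
\alpha \models_{\mathcal{V}} tt & & \\
\alpha \models_{\mathcal{V}} X & \text{iff} & \alpha \in \mathcal{V}(X) \\
\alpha \models_{\mathcal{V}} \Phi_1 \vee \Phi_2 & \text{iff} & \alpha \models_{\mathcal{V}} \Phi_1 \ \vee\ \alpha \models_{\mathcal{V}} \Phi_2 \\
\alpha \models_{\mathcal{V}} \Phi_1 \wedge \Phi_2 & \text{iff} & \alpha \models_{\mathcal{V}} \Phi_1 \ \wedge\ \alpha \models_{\mathcal{V}} \Phi_2 \\
\alpha \models_{\mathcal{V}} \langle a\rangle\Phi & \text{iff} & \exists\, \alpha'.\ \alpha \xrightarrow{a} \alpha' \ \wedge\ \alpha' \models_{\mathcal{V}} \Phi \\
\alpha \models_{\mathcal{V}} [a]\Phi & \text{iff} & \forall\, \alpha'.\ \alpha \xrightarrow{a} \alpha' \ \Rightarrow\ \alpha' \models_{\mathcal{V}} \Phi \\
\alpha \models_{\mathcal{V}} \mu X.\Phi & \text{iff} & \forall\, S \subseteq V^*.\ (\forall\, \beta \in V^*.\ \beta \models_{\mathcal{V}[X \mapsto S]} \Phi \Rightarrow \beta \in S) \Rightarrow \alpha \in S \\
\alpha \models_{\mathcal{V}} \nu X.\Phi & \text{iff} & \exists\, S \subseteq V^*.\ (\forall\, \beta \in V^*.\ \beta \in S \Rightarrow \beta \models_{\mathcal{V}[X \mapsto S]} \Phi) \ \wedge\ \alpha \in S
\end{array}
\]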


where $\mathcal{V}[X \mapsto S]$ is the valuation resulting from $\mathcal{V}$ by updating the binding of $X$ to $S$. The clauses for the fixpoints are a reformulation of the Tarski-Knaster theorem which states that the least fixpoint is the intersection of all pre-fixpoints and the greatest fixpoint is the union of all post-fixpoints. As a consequence, states satisfy a fixpoint formula iff they satisfy the unfolding of the formula, i.e. $\alpha \models_{\mathcal{V}} \sigma X.\Phi$ iff $\alpha \models_{\mathcal{V}} \Phi[\sigma X.\Phi/X]$ where $\sigma \in \{\mu, \nu\}$ and $\Phi[\Psi/X]$ denotes the simultaneous replacement of all free occurrences of $X$ in $\Phi$ by $\Psi$.
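To make the fixpoint clauses concrete, the following Python sketch evaluates formulas on a small finite transition graph. This is only an illustration of the standard semantics and is not the algorithm of this paper: on a finite state set the Tarski-Knaster fixpoints can be reached by plain iteration from the empty set (for $\mu$) or the full state set (for $\nu$), which is not possible for the infinite graphs considered here. The tuple encoding of formulas and the example graph are assumptions made for the sketch.

```python
from typing import Dict, FrozenSet, Set, Tuple

State = str
Trans = Set[Tuple[State, str, State]]
# Formulas in positive form, encoded as nested tuples (an assumed encoding):
# ("tt",) ("ff",) ("var", X) ("or", f, g) ("and", f, g)
# ("dia", a, f) ("box", a, f) ("mu", X, f) ("nu", X, f)


def sem(phi, states: Set[State], trans: Trans,
        val: Dict[str, FrozenSet[State]]) -> FrozenSet[State]:
    """Set of states of a FINITE graph satisfying phi under valuation val."""
    kind = phi[0]
    if kind == "tt":
        return frozenset(states)
    if kind == "ff":
        return frozenset()
    if kind == "var":
        return val[phi[1]]
    if kind == "or":
        return sem(phi[1], states, trans, val) | sem(phi[2], states, trans, val)
    if kind == "and":
        return sem(phi[1], states, trans, val) & sem(phi[2], states, trans, val)
    if kind == "dia":  # some a-successor satisfies the body
        target = sem(phi[2], states, trans, val)
        return frozenset(s for (s, a, t) in trans if a == phi[1] and t in target)
    if kind == "box":  # every a-successor satisfies the body
        target = sem(phi[2], states, trans, val)
        return frozenset(
            s for s in states
            if all(t in target for (s2, a, t) in trans if s2 == s and a == phi[1])
        )
    if kind in ("mu", "nu"):
        x, body = phi[1], phi[2]
        # Iterate from bottom (mu) or top (nu); monotonicity of positive-form
        # formulas guarantees termination on a finite state set.
        current = frozenset() if kind == "mu" else frozenset(states)
        while True:
            nxt = sem(body, states, trans, {**val, x: current})
            if nxt == current:
                return current
            current = nxt
    raise ValueError(f"unknown formula constructor: {kind}")


# Assumed example graph: s0 -a-> s1 -a-> s2, and s2 -b-> s2.
STATES = {"s0", "s1", "s2"}
TRANS = {("s0", "a", "s1"), ("s1", "a", "s2"), ("s2", "b", "s2")}
# mu X. <b>tt or <a>X : a state enabling b is reachable via a-steps.
REACH_B = ("mu", "X", ("or", ("dia", "b", ("tt",)), ("dia", "a", ("var", "X"))))
# nu X. <a>X : there is an infinite a-path.
INF_A = ("nu", "X", ("dia", "a", ("var", "X")))
```

On this graph every state satisfies REACH_B, while no state satisfies INF_A, because the only a-path ends in s2.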

The satisfaction relation defined above is independent of the valuation if the considered formula has no free variables, in which case we will drop the index $\mathcal{V}$. We extend our satisfaction relation, moreover, to sets of formulas by writing $\alpha \models \Gamma$ if $\alpha \models \Phi$ for all $\Phi \in \Gamma$. Finally, we observe that the usual denotation of formulas as the set of states where the formula holds is obtained in our presentation by $[\![\Phi]\!]_{\mathcal{V}} = \{\alpha \mid \alpha \models_{\mathcal{V}} \Phi\}$. Next we define some standard notions which will allow us to deal with occurrences of subformulas in a given formula, as well as to measure the complexity of a formula.


Definition 2.4 (Binding). A formula $\Phi$ is called well named if every fixpoint operator in $\Phi$ binds a distinct variable, and free variables are distinct from bound variables. With each well named formula $\Phi$ we then associate its binding function $\mathcal{D}_\Phi$ which assigns to every bound variable $X$ of $\Phi$ the unique subformula $\sigma X.\Psi(X)$ of $\Phi$, called the binding definition of $X$ in $\Phi$.


From  now  on  we  assume  that  every  formula  is  well  named. 


Definition 2.5 (Dependency order, Expansion). Given a formula $\Phi$, we define the dependency order over the bound variables of $\Phi$, denoted by $<_\Phi$, as the least partial order such that if $X$ occurs free in $\mathcal{D}_\Phi(Y)$ then $X <_\Phi Y$. Moreover, for every subformula $\Psi$ of $\Phi$, we define the expansion of $\Psi$ with respect to $\mathcal{D}_\Phi$ as: $\langle\Psi\rangle_{\mathcal{D}_\Phi} =_{df} \Psi[\mathcal{D}_\Phi(X_n)/X_n] \cdots [\mathcal{D}_\Phi(X_1)/X_1]$ where the sequence $(X_1, \ldots, X_n)$ is a linear ordering of all bound variables of $\Phi$ compatible with the dependency order, i.e. if $X_i <_\Phi X_j$ then $i < j$.
Definition 2.6 (Subformulas, Closure). The subformula relation on $L_\mu$, denoted by $\preceq$, is the least partial order on $L_\mu$ such that $\Phi_i \preceq \Phi_1 \vee \Phi_2$, $\Phi_i \preceq \Phi_1 \wedge \Phi_2$, $\Psi \preceq \langle a\rangle\Psi$, $\Psi \preceq [a]\Psi$, $\Psi \preceq \mu X.\Psi$, and $\Psi \preceq \nu X.\Psi$, for $i = 1, 2$ and $a \in Act$. Given a formula $\Phi$, we define the closure of $\Phi$ as $CL(\Phi) = \{\Psi \mid \Psi \preceq \Phi\}$. Furthermore, if $CL(\Phi) = \{\Psi_1, \ldots, \Psi_n\}$ we will henceforth assume that the subformulas $\Psi_i$ are linearly ordered compatible with $\preceq$, i.e. if $\Psi_i \preceq \Psi_j$ then $i > j$.
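The closure of a formula is simply its set of subformulas, which a short recursive Python sketch can enumerate. The nested-tuple encoding of formulas used here is an assumption made for illustration and does not appear in the paper.

```python
# Formulas encoded as nested tuples (assumed encoding):
# ("tt",) ("ff",) ("var", X) ("or", f, g) ("and", f, g)
# ("dia", a, f) ("box", a, f) ("mu", X, f) ("nu", X, f)


def closure(phi) -> set:
    """Return the set of all subformulas of phi, including phi itself."""
    result = {phi}
    kind = phi[0]
    if kind in ("or", "and"):
        # Both operands of a boolean connective are subformulas.
        result |= closure(phi[1]) | closure(phi[2])
    elif kind in ("dia", "box", "mu", "nu"):
        # phi[1] is the action or the bound variable; phi[2] is the body.
        result |= closure(phi[2])
    return result


# mu X. <b>tt or <a>X
EXAMPLE = ("mu", "X", ("or", ("dia", "b", ("tt",)), ("dia", "a", ("var", "X"))))
```

For EXAMPLE the closure has six elements: the formula itself, the disjunction, the two diamond formulas, tt, and the variable X. Its size bounds the number of equations per nonterminal in the scheme developed later.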

Definition 2.7 (Alternation Depth).

A formula $\Phi$ is said to be in the classes $\Sigma_0$ and $\Pi_0$ iff it contains no fixpoint operators. To form the class $\Sigma_{n+1}$, take $\Sigma_n \cup \Pi_n$, and close under (i) boolean and modal combinators, (ii) $\mu X.\Phi$, for $\Phi \in \Sigma_{n+1}$, and (iii) substitution of $\Phi' \in \Sigma_{n+1}$ for a free variable of $\Phi \in \Sigma_{n+1}$ provided that no free variable of $\Phi'$ is captured by $\Phi$; and dually for $\Pi_{n+1}$. The (Niwinski) alternation depth of a formula $\Phi$, denoted by $ad(\Phi)$, is then the least $n$ such that $\Phi \in \Sigma_{n+1} \cap \Pi_{n+1}$.

Assertion-Based Semantics  As pointed out in [BS92], context-free processes can be verified by considering Hoare-logic style pre-condition/post-condition pairs of sets of formulas for each of the nonterminals occurring in the context-free system. A triple $\{\Gamma\}\,\alpha\,\{\Delta\}$ is then interpreted as: $\alpha$ satisfies all formulas of $\Gamma$ if we assert that after termination of $\alpha$ exactly the set of formulas $\Delta$ holds. This intuition is formally captured by the following definition of assertion-based semantics which generalises standard semantics by taking into account the set of formulas which hold after termination of a process.

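Given $T_{\mathcal{C}} = (V^*, Act, \mapsto)$ and a valuation $\mathcal{V} : Var \to 2^{V^*}$, the inductive definition below stipulates when a context-free process $\alpha \in V^*$ has the property $\Phi$ under the hypothesis that after termination of $\alpha$ the formulas $\Delta$ hold, written as $\alpha \models_{\mathcal{V}} (\Phi, \Delta)$. If $\alpha$ fails to satisfy $\Phi$ under the hypothesis $\Delta$, we will write $\alpha \not\models_{\mathcal{V}} (\Phi, \Delta)$. First we have $\epsilon \models_{\mathcal{V}} (\Phi, \Delta)$ iff $\Phi \in \Delta$ and then, for $\alpha \neq \epsilon$,

\[
\begin{array}{lcl}
\alpha \models_{\mathcal{V}} (tt, \Delta) & & \\
\alpha \models_{\mathcal{V}} (X, \Delta) & \text{iff} & \alpha \in \mathcal{V}(X) \\
\alpha \models_{\mathcal{V}} (\Phi_1 \vee \Phi_2, \Delta) & \text{iff} & \alpha \models_{\mathcal{V}} (\Phi_1, \Delta) \ \vee\ \alpha \models_{\mathcal{V}} (\Phi_2, \Delta) \\
\alpha \models_{\mathcal{V}} (\Phi_1 \wedge \Phi_2, \Delta) & \text{iff} & \alpha \models_{\mathcal{V}} (\Phi_1, \Delta) \ \wedge\ \alpha \models_{\mathcal{V}} (\Phi_2, \Delta) \\
\alpha \models_{\mathcal{V}} (\langle a\rangle\Phi, \Delta) & \text{iff} & \exists\, \alpha'.\ \alpha \xrightarrow{a} \alpha' \ \wedge\ \alpha' \models_{\mathcal{V}} (\Phi, \Delta) \\
\alpha \models_{\mathcal{V}} ([a]\Phi, \Delta) & \text{iff} & \forall\, \alpha'.\ \alpha \xrightarrow{a} \alpha' \ \Rightarrow\ \alpha' \models_{\mathcal{V}} (\Phi, \Delta) \\
\alpha \models_{\mathcal{V}} (\mu X.\Phi, \Delta) & \text{iff} & \forall\, S \subseteq V^*.\ (\forall\, \beta \in V^*.\ \beta \models_{\mathcal{V}[X \mapsto S]} (\Phi, \Delta) \Rightarrow \beta \in S) \Rightarrow \alpha \in S \\
\alpha \models_{\mathcal{V}} (\nu X.\Phi, \Delta) & \text{iff} & \exists\, S \subseteq V^*.\ (\forall\, \beta \in V^*.\ \beta \in S \Rightarrow \beta \models_{\mathcal{V}[X \mapsto S]} (\Phi, \Delta)) \ \wedge\ \alpha \in S
\end{array}
\]

As in the case of the standard semantics, we will use $\alpha \models_{\mathcal{V}} (\Gamma, \Delta)$ to denote $\alpha \models_{\mathcal{V}} (\Phi, \Delta)$, for all $\Phi \in \Gamma$.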

The usefulness of the assertion-based semantics is underpinned by the following proposition [BS92] which states that, firstly, the assertion-based semantics extend the standard semantics, and secondly, that they allow one to reason compositionally about context-free processes.
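Proposition 2.8. The assertion-based semantics is

1. an extension of standard semantics, i.e. given a closed formula $\Phi$, we have

$\alpha \models \Phi$ iff $\alpha \models (\Phi, \Delta_\epsilon)$ for $\Delta_\epsilon = \{\Psi \in CL(\Phi) \mid \epsilon \models \langle\Psi\rangle_{\mathcal{D}_\Phi}\}$.

2. compositional wrt. context-free processes, i.e. for all $\Delta, \Gamma \subseteq L_\mu$,

$\alpha\beta \models (\Gamma, \Delta)$ iff $\exists\, \Sigma \subseteq L_\mu.\ \alpha \models (\Gamma, \Sigma)$ and $\beta \models (\Sigma, \Delta)$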

The effectiveness of our algorithm, which is presented in the next section, relies, in particular, on Proposition 2.8.1, as it shows that $\Phi$ can be verified by taking into account merely the semantics of all subformulas of $\Phi$.

3  The  Model-Checking  Algorithm 

In this section we develop our model-checking algorithm which checks closed μ-formulas with arbitrary alternation depth for context-free processes in exponential time. In fact, the algorithm coincides with a backtracking extension of the model-checker of [BS92] which deals only with the alternation-free fragment of the modal mu-calculus.

Given a context-free system $\mathcal{C}$ and a closed formula $\Phi$, each nonterminal $A \in V$ defines a mapping $[\![A]\!] : 2^{CL(\Phi)} \to 2^{CL(\Phi)}$ from post- to pre-conditions. As we are, however, in particular interested in the question whether a given subformula $\Psi \in CL(\Phi)$ belongs to the pre-condition set or not, we refine this notion by defining the following functions, called characteristic property transformers (CPT).

\[
[\![A]\!]^{\Psi}(\Delta) =_{df} \begin{cases} 1 & \text{if } A \models (\Psi, \Delta) \\ 0 & \text{otherwise} \end{cases}
\]

Writing $\mathbb{B}$ for the usual lattice of boolean values, characteristic property transformers are elements of the boolean lattice consisting of all functions from $2^{CL(\Phi)}$ to $\mathbb{B}$, where the ordering, and the meet and join operations respectively, are defined argument-wise. More importantly, they can be obtained as a fixpoint solution of an appropriate function scheme, called the property transformer scheme (PTS). This scheme is defined by the rules given in Figure 1, and consists of two parts. The first part copes with the structure of the context-free system, as well as with the semantics of the formula, and defines an equation for each pair $(A, \Psi) \in V \times CL(\Phi)$. The second part deals with the empty process according to the first clause of the assertion-based semantics, as well as with composed processes according to Proposition 2.8.2. Whereas the rules for the basic cases mimic directly the semantics of the subformula, the fixpoint related equations are slightly more complicated and require a simultaneous computation of all their corresponding transformers. $sel_A$ then simply selects the $A$ component of the resulting tuple. The other auxiliary function, $mem_\Psi$, tests the membership of $\Psi$ in the given set of formulas. It returns 1 if $\Psi \in \Delta$ and 0 otherwise.
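The argument-wise lattice structure on such transformers can be sketched in a few lines of Python. The sketch assumes a transformer is stored as a dictionary from post-condition sets (frozensets of formula names) to booleans; the function names mem, join, meet and leq are chosen for illustration only.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List

Transformer = Dict[FrozenSet[str], bool]


def powerset(items: Iterable[str]) -> List[FrozenSet[str]]:
    """All subsets of the given closure, i.e. all candidate post-conditions."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def mem(psi: str, closure: Iterable[str]) -> Transformer:
    """Membership test: maps a post-condition Delta to True iff psi is in Delta."""
    return {delta: psi in delta for delta in powerset(closure)}


def join(f: Transformer, g: Transformer) -> Transformer:
    """Argument-wise join (disjunction)."""
    return {d: f[d] or g[d] for d in f}


def meet(f: Transformer, g: Transformer) -> Transformer:
    """Argument-wise meet (conjunction)."""
    return {d: f[d] and g[d] for d in f}


def leq(f: Transformer, g: Transformer) -> bool:
    """Argument-wise ordering: f <= g iff f(d) implies g(d) for every d."""
    return all((not f[d]) or g[d] for d in f)
```

With a closure of n formulas each transformer has $2^n$ arguments, which is one source of the exponential cost of solving the scheme.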

The  overall  structure  of  the  model-checking  algorithm  consists  now  of  the 
following  three  steps. 
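\[
\begin{array}{lcl}
[\![A]\!]^{tt} &=& \mathbf{1} \\
[\![A]\!]^{ff} &=& \mathbf{0} \\
[\![A]\!]^{\Phi_1 \vee \Phi_2} &=& [\![A]\!]^{\Phi_1} \sqcup [\![A]\!]^{\Phi_2} \\
[\![A]\!]^{\Phi_1 \wedge \Phi_2} &=& [\![A]\!]^{\Phi_1} \sqcap [\![A]\!]^{\Phi_2} \\
[\![A]\!]^{X} &=& \mathcal{V}(X, A) \\
[\![A]\!]^{\langle a\rangle\Phi} &=& \bigsqcup_{A \xrightarrow{a} \alpha} [\![\alpha]\!]^{\Phi} \\
[\![A]\!]^{[a]\Phi} &=& \bigsqcap_{A \xrightarrow{a} \alpha} [\![\alpha]\!]^{\Phi} \\
[\![A]\!]^{\mu X.\Phi} &=& sel_A(\sqcap\{\ldots\}) \\
[\![A]\!]^{\nu X.\Phi} &=& sel_A(\sqcup\{\ldots\}) \\[1ex]
[\![\epsilon]\!]^{\Psi}(\Delta) &=& mem_\Psi(\Delta) \\
[\![A\alpha]\!]^{\Psi}(\Delta) &=& [\![A]\!]^{\Psi}(\{\Psi' \in CL(\Phi) \mid [\![\alpha]\!]^{\Psi'}(\Delta) = 1\})
\end{array}
\]

Figure 1. The property transformer scheme.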


1. Given a context-free system $\mathcal{C}$ and a closed μ-formula $\Phi$, construct the property transformer scheme according to the rules given in Figure 1.

2. Solve the (finite) fixpoint problem for the property transformer scheme.

3. Check whether $[\![A_1]\!]^{\Phi}(\Delta_\epsilon) = 1$ where $A_1$ is the root of the context-free system, and $\Delta_\epsilon = \{\Psi \in CL(\Phi) \mid \epsilon \models \langle\Psi\rangle_{\mathcal{D}_\Phi}\}$.

In Section 4 we prove that the second step of the algorithm computes transformers which reflect the assertion-based semantics, while Proposition 2.8.1 now ensures that the third step solves the model-checking problem, as we have

$[\![A_1]\!]^{\Phi}(\Delta_\epsilon) = 1$ iff $A_1 \models (\Phi, \Delta_\epsilon)$ iff $A_1 \models \Phi$.

Moreover, the ordinary semantics of $\Phi$ can be obtained from the set of CPT's by means of $[\![\Phi]\!] = \{\alpha \in V^* \mid [\![\alpha]\!]^{\Phi}(\Delta_\epsilon) = 1\}$. This set can always be shown to be a regular set of states.

As expected, the required backtracking for alternating μ-formulas yields a worst-case time complexity for the algorithm, which is exponentially worse (in the alternation depth) than the estimation given for the alternation-free case [BS92, BS95].

Theorem 3.1 (Complexity).

Let $\mathcal{C}$ be a context-free system, and $\Phi$ be a closed μ-formula. Then the worst-case time complexity of solving the property transformer scheme is

4  Dynamic  Environments 

In the presence of formulas containing free variables the simple composition property of Proposition 2.8.2 no longer captures correctly the behaviour of context-free processes wrt. the specification at hand. This defect is eliminated by the slight modification given below.

$\{\Gamma, \mathcal{V}'\}\ \alpha\beta\ \{\Delta, \mathcal{V}\}$ iff $\exists\, \Sigma, \mathcal{V}''.\ \{\Gamma, \mathcal{V}'\}\ \alpha\ \{\Sigma, \mathcal{V}''\}$ and $\{\Sigma, \mathcal{V}''\}\ \beta\ \{\Delta, \mathcal{V}\}$

Intuitively, the modified composition rule expresses that in addition to assertions also environments must be adapted when considered at intermediate states. In general, the valuation $\mathcal{V}''$ is obtained from $\mathcal{V}$ by right cancellation of $\beta$, i.e. for all $X \in dom(\mathcal{V})$, $\mathcal{V}''(X) = (\mathcal{V}(X) \cap V^*\beta)\beta^{-1}$. As an example, $\alpha\beta \in \mathcal{V}(X)$ would imply $\alpha \in \mathcal{V}''(X)$.
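Right cancellation keeps exactly those words of a valuation that end in the cancelled suffix and strips that suffix. A minimal Python sketch follows, under the simplifying assumption that a valuation entry is a finite set of strings; the valuations in the paper are regular and in general infinite, which is why automata are used below instead.

```python
from typing import Dict, Set


def right_cancel(words: Set[str], beta: str) -> Set[str]:
    """Right quotient by beta: { alpha | alpha + beta in words }."""
    # Slice by explicit length so that an empty beta leaves each word unchanged.
    return {w[:len(w) - len(beta)] for w in words if w.endswith(beta)}


def adapt(valuation: Dict[str, Set[str]], beta: str) -> Dict[str, Set[str]]:
    """Apply right cancellation of beta to every variable of a valuation."""
    return {x: right_cancel(ws, beta) for x, ws in valuation.items()}
```

For instance, cancelling B from the set {"AB", "B", "AA"} yields {"A", ""}: the word AA is dropped because it does not end in B.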

In the remainder of this section we fix now a context-free system $\mathcal{C}$, and a formula $\Phi$ with closure $CL(\Phi) = \{\Psi_1, \ldots, \Psi_n\}$. Our aim is to develop a formalism, the dynamic environments, which faithfully models the adaptations of valuations needed for composition. Dynamic environments will be partitioned into levels $k \in [1, n]$ where a dynamic environment of level $k$ defines the valuations for $\{\Psi_1, \ldots, \Psi_k\}$. This change from valuations for variables to valuations for subformulas is reflected in the semantics by adding the rule “if $\Psi \in dom(\mathcal{V})$ then ($\alpha \models_{\mathcal{V}} \Psi$ iff $\alpha \in \mathcal{V}(\Psi)$)”. The original model-checking problem is then reduced to a corresponding fixpoint problem on the finite domain of dynamic environments, such that the semantics of the original formula is captured by the final environment of level $n$.

Definition  4.1  (Dynamic  Environment). 

A dynamic environment $\mathcal{A}_k$ of level $k \in [1, n]$ is a sequence of deterministic
finite-state  automata  ^  where  =  (2*^  ^ 

are  the  state  sets  of  the  automata,  V  is  the  input  alphabet,  '.  Q^i  ^ 
\/  ^  are  the  transition  functions  obeying  the  constraints  = 

A  implies  (A-i,  A)  -  A_i  where  Ai  denotes  {Ai,...,  A),  and  = 

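$\{\Delta_i \in Q_{A_i} \mid \Psi_i \in \Delta_i\}$ is the set of accepting states. Denoting the transitive closure of $\delta_{A_i}$, as usual, also by $\delta_{A_i}$, the language accepted by $A_i$ starting in the state $\Delta_i$ is $L_{A_i}(\Delta_i) = \{\alpha \in V^* \mid \delta_{A_i}(\Delta_i, \tilde{\alpha}) \in F_{A_i}\}$ where $\tilde{\alpha}$ is the reverse of $\alpha$.²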

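A dynamic environment $\mathcal{A}_k$ together with a state $\Delta_k$ is then interpreted as an environment which defines valuations for $\Psi_1, \ldots, \Psi_k$ by means of

$\mathcal{A}_1(\Delta_1) =_{df} [\Psi_1 \mapsto L_{A_1}(\Delta_1)]$

$\mathcal{A}_k(\Delta_k) =_{df} \mathcal{A}_{k-1}(\Delta_{k-1})[\Psi_k \mapsto L_{A_k}(\Delta_k)]$ for $2 \le k \le n$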

Dynamic environments are a convenient formalism to describe the semantics of μ-formulas on context-free processes since they model compositionality simply by transitions in the finite automaton.

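Lemma 4.2. Let $\{\Gamma, \mathcal{V}'\}\ A\ \{\Delta, \mathcal{A}_k(\Delta_k)\}$. Then

1. For all $i \le k$, $\Psi_i \in \Gamma$ iff $A \in \mathcal{A}_k(\Delta_k)(\Psi_i)$, and 2. $\mathcal{V}' = \mathcal{A}_k(\delta_{\mathcal{A}_k}(\Delta_k, A))$.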

² Here we have to use $\tilde{\alpha}$ as the automaton has to model the above-mentioned right cancellation.




The first property expresses that a dynamic environment of level $k$ captures the semantics of all subformulas up to level $k$, while the second property states that the environment to be considered in the pre-condition of $A$ coincides with the interpretation of the $A$-successor of $\Delta_k$ in $\mathcal{A}_k$.

The  granularity  of  the  transition  functions  of  dynamic  environments  is  not 
sufficient  to  obtain  a  match  between  the  semantic  and  the  iterative  intuition 
behind  the  model  checking  problem.  We  therefore  split  these  transition  functions 
into  characteristic  transition  functions  as  follows. 

The  split  into  characteristic  transition  functions  allows  us  to  view  a  dynamic 
environment  Ak  as  a  matrix  of  CTF’s  as  depicted  below. 


$\delta^{k,1}\quad \delta^{k,2}\quad \ldots\quad \delta^{k,n}$

This matrix can be systematically extended to a matrix for $\mathcal{A}_{k+1}$ with new row $\delta^{k+1,1}, \ldots, \delta^{k+1,n}$ by means of a fixpoint computation such that the final result will capture the semantics of the formula $\Phi$ on the given process³.

As  will  be  elaborated  on  in  the  next  subsection,  these  matrices  are  adequate 
for  proving  our  main  result,  Theorem  4.6,  i.e.  the  equivalence  of  the  semantic 
and  the  iterative  algorithm  presented  in  Section  4.1,  because  it  is  possible  to 
“synchronize”  their  corresponding  computations  on  the  diagonal. 

4.1  Semantic  and  Iterative  Solutions 

Given the semantics of the formulas $\Psi_1, \ldots, \Psi_{k-1}$ in terms of a dynamic environment $\mathcal{A}_{k-1}$, we will now consider the semantics of the remaining formulas.

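Definition 4.3 (Semantic Solutions).

We call $A_k$, for $k \in [1, n]$, the semantic solution of $\mathcal{A}_{k-1}$, written as $s(\mathcal{A}_{k-1})$, if the transition function of $A_k$ satisfies

$\delta_{A_k}(\Delta_k, A) = \Gamma_k$ iff $\{\Gamma_k, \mathcal{A}_{k-1}(\Gamma_{k-1})\}\ A\ \{\Delta_k, \mathcal{A}_{k-1}(\Delta_{k-1})\}$.

Moreover, we call $(A_k, \ldots, A_n)$ the semantic solutions of $\mathcal{A}_{k-1}$, denoted by $S(\mathcal{A}_{k-1})$, if $A_i = s(\mathcal{A}_{i-1})$, for $i \in [k, n]$.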

It  turns  out  that  the  semantic  solution  respects  the  standard  substitution  lemma. 

³ More precisely, since the arity of characteristic transition functions depends on the row, they have to be adapted as described in [BS97] during this computation.
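Lemma 4.4. Let $\{\Gamma_k, \mathcal{A}_{k-1}(\Gamma_{k-1})\}\ A\ \{\Delta_k, \mathcal{A}_{k-1}(\Delta_{k-1})\}$ and let $A_k$ be the semantic solution of $\mathcal{A}_{k-1}$. Then

$\{\Gamma_k, \mathcal{A}_{k-1}(\Gamma_{k-1})[\Psi_k \mapsto L_{A_k}(\Gamma_k)]\}\ A\ \{\Delta_k, \mathcal{A}_{k-1}(\Delta_{k-1})[\Psi_k \mapsto L_{A_k}(\Delta_k)]\}$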

Corollary  4.5  (Diagonal  Consistency).  IfAk,  ■■^,An  are  the  semantic  solu¬ 
tions  of  Ak-i  then  ,  for  i  G  [/i^,  n]^j  G  [1,?^]- 

Due to this corollary we may simply identify the semantic solutions $A_k, \ldots, A_n$ with the characteristic transition functions $\delta^{k,k}, \ldots, \delta^{n,n}$.

Let us finally sketch the resulting (conceptual) algorithm which iteratively computes the semantic solutions for $\mathcal{A}_{k-1}$. Given $\mathcal{A}_{k-1}$, we would like to compute $\delta^{i,j}$ for $i \in [k, n]$, $j \in [1, n]$. By Corollary 4.5 we already know $\delta^{i,j}$ for $i \in [k, n]$, $j \in [1, k-1]$. The remaining characteristic transition functions are
then  computed  level- wise  by  a  two-level  fixpoint  computation.  During  the  inner- 
level  computation  we  have  fixed  some  approximant  6'^’^  and  vary  the  values  of 
The  idea  is  that  , . . .  together  with  Ak-i  defines 

a  dynamic  environment  Ak  for  which  we  can  compute  the  semantic  solutions 
0k+i,k+i  ^  _  ^0ri,n  induction.  We  may  therefore  update  . . . ,  by 

^A’+1,A:+1^  repeat  this  iteration  until  we  reach  consistency.  In  the 

outer-level  fixpoint  computation  we  may  now  update  the  fixed  by  evaluat¬ 
ing  the  characteristic  transition  function  for  the  “unfolding”  of  L^k  in  the  current 
setting,  and  start  the  inner  fixpoint  computation  again.  Our  main  theorem 
then  states  that  if  we  have  reached  consistency  also  at  the  outer-level  then  the 
iterative  and  the  semantic  solutions  for  coincide. 

Theorem 4.6. For any given dynamic environment $\mathcal{A}_k$, the semantic and the iterative solutions coincide.

The observation that only the characteristic transition functions on the diagonal have to be taken into account when updating wrt. the current dynamic environment $\mathcal{A}_n$ allows us to replace the “conceptual” algorithm used in the correctness proof by the “actual” model-checking algorithm presented in Section 3. This optimization is the key for proving the claimed complexity result.

5  Conclusions  and  Further  Research 

In this paper we have presented an iterative, exponential model-checking algorithm for context-free processes which deals with the full modal mu-calculus. This basic algorithm can also be extended to the class of pushdown processes following the lines of [BS95], as well as to the class of regular graphs following the lines of [BQ97]. Essentially, both extensions are obtained by taking into account the arity $Q$ of pushdown processes (i.e. the number of states in the finite control), respectively regular graphs (i.e. the maximal arity of a hyperedge), which yields characteristic property transformers with multiple arguments. For these extensions our algorithm has the worst-case time complexity

0{  1^1  *  (IQI  *  ). 




Recently,  Walukiewicz  presented  another  model-checker  for  pushdown  pro¬ 
cesses  which  uses  games  [Wal96].  His  algorithm  has  the  different  complexity 
estimation  0(  \C\  +  )  and  behaves  hence  worse  for  increasing 

degrees  of  alternation  depths. 

Since our algorithm directly mimics the behavioural intuition behind sequential processes and, in particular, keeps process and formula structure transparent, it gives a direct handle to extending the underlying process structure. Intended future work includes plans, first, to extend model-checking to the class of rational restricted recognizable graphs as introduced in [Cau96], and second, to develop a local variant. Both extensions will exploit the structural transparency of our approach and, in particular, use the framework of dynamic environments.
Abstract. We introduce a symbolic model checking procedure for Probabilistic Computation Tree Logic PCTL over labelled Markov chains as models. Model checking for probabilistic logics typically involves solving linear equation systems in order to ascertain the probability of a given formula holding in a state.
Our algorithm is based on the idea of representing the matrices used in the linear equation systems by Multi-Terminal Binary Decision Diagrams (MTBDDs) introduced in Clarke et al. [14]. Our procedure, based on the algorithm used by
Hansson  and  Jonsson  [24],  uses  BDDs  to  represent  formulas  and  MTBDDs  to 
represent  Markov  chains,  and  is  efficient  because  it  avoids  explicit  state  space 
construction.  A  PCTL  model  checker  is  being  implemented  in  Verus  [9]. 


1  Introduction 

Probabilistic techniques, and in particular probabilistic logics, have proved successful in the specification and verification of systems that exhibit uncertainty, such as fault-tolerant systems, randomized distributed systems and communication protocols. Models for such systems are variants of probabilistic automata (such as labelled Markov chains used in e.g. [24, 34, 35, 17]), in which the usual (boolean) transition relation is replaced with its probabilistic version given in the form of a Markov probability transition matrix. The probabilistic logics are typically obtained by “lifting” a non-probabilistic logic to the probabilistic case by constructing for each formula $\phi$ and a real number $p$ in the [0, 1]-interval the formula $[\phi]_{\ge p}$ in which $p$ acts as a threshold for truth in the sense that for the formula $[\phi]_{\ge p}$ to be satisfied (in the state $s$) the probability that $\phi$ holds in $s$ must be at least $p$ (see [26, 32, 25] for a different approach). With such logics one can express quantitative properties such as “the probability of the message being delivered within $t$ time steps is at least 0.75” (see e.g. the timing or average-case analysis of real-time or randomized distributed systems [24, 23, 5, 6, 2]) or (the more prevalent) qualitative properties, for which $\phi$ is required to be satisfied by almost all executions (which amounts to showing that $\phi$ is satisfied with probability 1, see e.g. [1, 17, 23, 24, 21, 22, 29, 30, 34]).

* This research was sponsored in part by the National Science Foundation under grant no. CCR-8722633, by the Semiconductor Research Corporation under contract 92-DJ-294, and by the Wright Laboratory, Aeronautical Systems Center, Air Force Materiel Command, USAF, the Advanced Research Projects Agency (ARPA) under grant F33615-93-1-1330.

** This research was sponsored in part by the European Union ESPRIT projects ASPIRE and FIREworks, British Telecom, and the Nuffield Foundation.
Much has been published concerning the verification methods for probabilistic logics. Probabilistic extensions of dynamic logic [26] and temporal and modal logics, e.g. [2, 6, 17, 24, 21, 27, 30, 31, 34], and automatic procedures for checking satisfaction for such logics have been proposed. The latter are based on reducing the calculation of the probability of formulas being satisfied to a linear algebra problem: for example, in [24], the calculation of the probability of ‘until’ formulas is based on solving the linear equation system given by an $n \times n$ matrix where $n$ is the size of the state space. Optimal methods are known (for sequential Markov chains, the lower bound is single exponential in the size of the formula and polynomial in the size of the Markov chain [18]), but these algorithms are not of much practical use when verifying realistic systems. As a result, efficiency of probabilistic analysis lags behind efficient model checking techniques for conventional logics, such as symbolic model checking [11, 12, 10, 8, 15, 28], for which tools capable of tackling industrial scale applications are available (cf. smv). This is undesirable as probabilistic approaches allow one to establish that certain properties hold (in some meaningful probabilistic sense) where conventional model checkers fail, either because the property simply is not true in the state (but holds in that state with some acceptable probability), or because exhaustive search of only a portion of the system is feasible.

The main difficulty with current probabilistic model checking is the need to integrate a linear algebra package with a conventional model checker. Despite the power of existing linear algebra packages, this can lead to inefficient and time consuming computation through the implicit requirement for the construction of the state space. This paper proposes an alternative, which is based on expressing the probability calculations in terms of Multi-Terminal Binary Decision Diagrams (MTBDDs) [16]. MTBDDs are a generalization of (ordered) BDDs in the sense that they allow arbitrary real numbers in the terminal nodes instead of just 0 and 1, and so can provide a compact representation for matrices. As a matter of fact, in [13] MTBDDs have been shown to perform no worse than sparse matrices. Thus, converting to MTBDDs ensures smooth integration with a symbolic model checker such as smv and has the potential to outperform sparse matrices due to the compactness of the representation, in the same way as BDDs have outperformed other methods. As with BDDs, the precise time complexity estimates of model checking for MTBDDs are difficult to obtain, but the success of BDDs in practice [8, 28] serves as sufficient encouragement to develop the foundations of MTBDD-based probabilistic model checkers.

In  this  paper  we  consider  a  probabilistic  extension  of  CTL  called  Probabilistic  Com¬ 
putation  Tree  Logic  (PCTL),  and  give  a  symbolic  model  checking  procedure  which 
avoids  the  explicit  construction  of  the  state  space.  We  use  finite-state  labelled  Markov 
chains  as  models.  The  model  checking  procedure  is  based  on  that  of  [24,  18],  but  we 
use  BDDs  to  represent  the  boolean  formulas,  and  a  suitable  combination  of  BDDs  and 
MTBDDs  for  probabilistic  formulas.  Currently,  we  are  implementing  the  PCTL  sym¬ 
bolic  model  checking  in  Verus  [9].  For  reasons  of  space  we  omit  much  detail  from  this 
paper,  which  will  be  reported  in  [4].  We  assume  some  familiarity  with  BDDs,  automata 
on  infinite  sequences,  probability  and  measure  theory  [8,  33,  20]. 

2 Labelled Markov chains

We  use  discrete  time  Markov  chains  as  models  (we  do  not  consider  nondeterminism). 
Let  AP  denote  a  finite  set  of  atomic  propositions.  A  labelled  Markov  chain  over  a  set 
of atomic propositions AP is a tuple $M = (S, P, L)$ where $S$ is a finite set of states,
$P : S \times S \to [0,1]$ a transition matrix, i.e. $\sum_{t \in S} P(s,t) = 1$ for all $s \in S$,
and $L : S \to 2^{AP}$ a labelling function which assigns to each state $s \in S$ a set of
atomic propositions. We assume that there are $2^n$ states for some $n$, and that there are
sufficiently many atomic propositions to distinguish them (i.e. $L(s) \neq L(s')$ for all
states $s, s'$ with $s \neq s'$). Any labelled Markov chain may be transformed into one
satisfying  these  conditions  by  adding  dummy  states  and  new  propositions. 

Execution sequences arise by resolving the probabilistic choices. Formally, an execution
sequence in $M$ is a nonempty (finite or infinite) sequence $\pi = s_0 s_1 s_2 \ldots$
where $s_i$ are states and $P(s_{i-1}, s_i) > 0$, $i = 1, 2, \ldots$. The first state of $\pi$ is denoted
by $first(\pi)$. $\pi(k)$ denotes the $(k+1)$-th state of $\pi$. An execution sequence $\pi$ is also
called a path, and a full path iff it is infinite. $Path_\omega(s)$ is the set of full paths $\pi$ with
$first(\pi) = s$. For $s \in S$, let $\Sigma(s)$ be the smallest $\sigma$-algebra on $Path_\omega(s)$ which
contains the basic cylinders $\{\pi \in Path_\omega(s) : \rho \text{ is a prefix of } \pi\}$ where $\rho$ ranges over
all finite execution sequences starting in $s$. The probability measure $Prob$ on $\Sigma(s)$ is
the unique measure with $Prob\{\pi \in Path_\omega(s) : \rho \text{ is a prefix of } \pi\} = P(\rho)$ where

$$P(s_0 s_1 \ldots s_k) = P(s_0, s_1) \cdot P(s_1, s_2) \cdot \ldots \cdot P(s_{k-1}, s_k).$$
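
The measure of a basic cylinder is just a product of transition probabilities, which can be sketched in a few lines of Python. This is only an illustration: the dictionary encoding of the transition matrix, the name `path_probability` and the two-state chain are assumptions made here and do not come from the paper.

```python
from fractions import Fraction

# Hypothetical two-state chain (not from the paper): P[s][t] is the
# probability of moving from state s to state t; each row sums to 1.
P = {
    "a": {"a": Fraction(1, 2), "b": Fraction(1, 2)},
    "b": {"a": Fraction(1, 4), "b": Fraction(3, 4)},
}

def path_probability(P, path):
    """Measure of the basic cylinder of all full paths with prefix `path`."""
    prob = Fraction(1)
    # Multiply P(s_0, s_1) * P(s_1, s_2) * ... * P(s_{k-1}, s_k).
    for s, t in zip(path, path[1:]):
        prob *= P[s].get(t, Fraction(0))
    return prob
```

A prefix consisting of a single state has measure 1, since every full path from that state extends it.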


Example 1. We consider a simple communication protocol similar to that in [24]. The
system  consists  of  three  entities:  a  sender,  a  medium  and  a  receiver.  The  sender  sends 
a  message  to  the  medium,  which  in  turn  tries  to  deliver  the  message  to  the  receiver. 
With probability $\frac{1}{100}$ the message gets lost, in which case the medium tries again to
deliver the message. With probability $\frac{1}{100}$ the message is corrupted (but delivered); with
probability $\frac{98}{100}$ the correct message is delivered. When the (correct or faulty) message
is  delivered  the  receiver  acknowledges  the  receipt  of  the  message.  For  simplicity,  we 
assume  that  the  acknowledgement  cannot  be  corrupted  or  lost.  We  describe  the  system 
in  a  simplified  way  where  we  omit  all  irrelevant  states  (e.g.  the  state  where  the  receiver 
acknowledges  the  receipt  of  the  correct  message). 

We  use  the  following  four  states: 

$s_{init}$: the state in which the sender passes the message to the medium

$s_{del}$: the state in which the medium tries to deliver the message

$s_{lost}$: the state reached when the message is lost

$s_{error}$: the state reached when the message is corrupted

The transition $s_{del} \to s_{init}$ stands for the acknowledgement of the receipt of the correct
message, $s_{error} \to s_{init}$ for the acknowledgement of the receipt of the corrupted message.
We use two atomic propositions $a_1$, $a_2$ and the labelling function $L(s_{init}) = \emptyset$,
$L(s_{del}) = \{a_1, a_2\}$, $L(s_{lost}) = \{a_2\}$, $L(s_{error}) = \{a_1\}$. ■

3 Probabilistic branching time temporal logic

In  this  section  we  present  the  syntax  and  semantics  of  the  logic  PCTL  (Probabilistic 
Computation  Tree  Logic)  introduced  by  Hansson  &  Jonsson  [24]^.  PCTL  is  a  proba¬ 
bilistic  extension  of  CTL  which  allows  one  to  express  quantitative  properties  of  proba¬ 
bilistic  processes  such  as  “the  system  terminates  with  probability  at  least  0.75”.  PCTL 
contains  atomic  propositions  and  the  operators:  next-step  X  and  until  U.  The  operators 
X  and  U  are  used  in  connection  with  an  interval  of  probabilities.  The  syntax  of  PCTL 
is  as  follows: 

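$$\Phi ::= tt \;\mid\; a \;\mid\; \Phi_1 \wedge \Phi_2 \;\mid\; \neg\Phi \;\mid\; [X\Phi]_{\sqsupseteq p} \;\mid\; [\Phi_1 \, U \, \Phi_2]_{\sqsupseteq p}$$

where $a$ is an atomic proposition, $p \in [0,1]$, and $\sqsupseteq$ is either $\geq$ or $>$. Formulas of the
form $X\Phi$ or $\Phi_1 U \Phi_2$, where $\Phi$, $\Phi_1$, $\Phi_2$ are PCTL formulas, are called path formulas.

PCTL formulas are interpreted over the states of a labelled Markov chain, whereas path
formulas are interpreted over paths. The subscript $\sqsupseteq p$ denotes that the probability of
paths starting in the current state fulfilling the path formula is $\sqsupseteq p$. Thus, PCTL is like
CTL, except that the path operators A and E in CTL have been replaced by the operator
$[\,\cdot\,]_{\sqsupseteq p}$. The usual derived constants and operators are: $ff = \neg tt$,
$\Phi_1 \vee \Phi_2 = \neg(\neg\Phi_1 \wedge \neg\Phi_2)$, $\Phi_1 \to \Phi_2 = \neg\Phi_1 \vee \Phi_2$. Operators for modelling
"eventually" or "always" can be derived by: $[\Diamond\Phi]_{\sqsupseteq p} = [tt \, U \, \Phi]_{\sqsupseteq p}$, and similarly
for $[\Box\Phi]_{\sqsupseteq p}$.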

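Let $M = (S, P, L)$ be a labelled Markov chain. The satisfaction relation
$\models \; \subseteq S \times PCTL$ is given by

$s \models tt$ for all $s \in S$
$s \models a$ iff $a \in L(s)$
$s \models \Phi_1 \wedge \Phi_2$ iff $s \models \Phi_1$ and $s \models \Phi_2$
$s \models \neg\Phi$ iff $s \not\models \Phi$
$s \models [X\Phi]_{\sqsupseteq p}$ iff $Prob\{\pi \in Path_\omega(s) : \pi \models X\Phi\} \sqsupseteq p$
$s \models [\Phi_1 U \Phi_2]_{\sqsupseteq p}$ iff $Prob\{\pi \in Path_\omega(s) : \pi \models \Phi_1 U \Phi_2\} \sqsupseteq p$
$\pi \models X\Phi$ iff $\pi(1) \models \Phi$
$\pi \models \Phi_1 U \Phi_2$ iff there exists $k \geq 0$ with $\pi(i) \models \Phi_1$, $i = 0, 1, \ldots, k-1$ and $\pi(k) \models \Phi_2$.

For a path formula $f$ the set $\{\pi \in Path_\omega(s) : \pi \models f\}$ is measurable [34, 18]. If $s \models \Phi$
then we say $s$ satisfies $\Phi$ (or $\Phi$ holds in $s$). The truth value of formulas involving the
linear time quantifiers $\Diamond$ and $\Box$ can be derived:

$s \models [\Diamond\Phi]_{\sqsupseteq p}$ iff $Prob\{\pi \in Path_\omega(s) : \pi(k) \models \Phi \text{ for some } k \geq 0\} \sqsupseteq p$
$s \models [\Box\Phi]_{\sqsupseteq p}$ iff $Prob\{\pi \in Path_\omega(s) : \pi(k) \models \Phi \text{ for all } k \geq 0\} \sqsupseteq p$.

Given a probabilistic process $\mathcal{P}$, described by a labelled Markov chain $M = (S, P, L)$
with an initial state $s$, we say $\mathcal{P}$ satisfies a PCTL formula $\Phi$ iff $s \models \Phi$. For instance, if
$a$ is an atomic proposition which stands for termination and $\mathcal{P}$ satisfies $[\Diamond a]_{\geq p}$ then $\mathcal{P}$
terminates with probability at least $p$.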


4  Multi-terminal  binary  decision  diagrams 

Ordered Binary Decision Diagrams (BDDs) [7, 8, 15, 28] are a compact representation
of boolean functions $f : \{0,1\}^n \to \{0,1\}$. They are based on the canonical representation
of the binary tree of the function as a directed graph obtained through folding
internal nodes representing identical subfunctions (subject to an ordering of the variables
to guarantee uniqueness of the representation) and using 0 and 1 as leaves. In [16]
it is shown how one can generalize BDDs to cogently and efficiently represent matrices
in terms of so-called multi-terminal binary decision diagrams (MTBDDs).

^ For simplicity we omit the bounded 'until' operator of [24].

Formally, MTBDDs can be defined as follows. Let $x_1, \ldots, x_n$ be distinct variables,
which we order by $x_i < x_j$ iff $i < j$. A multi-terminal binary decision diagram
(MTBDD) over $(x_1, \ldots, x_n)$ is a rooted, directed graph with vertex set $V$ containing
two types of vertices, nonterminal and terminal. Each nonterminal vertex $v$ is labelled
by a variable $var(v) \in \{x_1, \ldots, x_n\}$ and has two children $left(v), right(v) \in V$.
Each terminal vertex $v$ is labelled by a real number $value(v)$. For each nonterminal
node $v$, we require $var(v) < var(left(v))$ if $left(v)$ is nonterminal, and similarly,
$var(v) < var(right(v))$ if $right(v)$ is nonterminal. A suitable adaptation of the operator
REDUCE(·) [7] yields an operator which accepts an MTBDD as its input and
returns the corresponding reduced MTBDD.
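
The definition can be made concrete with a small Python sketch. It is an illustration under assumptions made here, not the data structure of [16] or of any BDD package: vertices are nested tuples, `build` enumerates all $2^n$ assignments (so it is only usable for tiny $n$), and reduction is performed on the fly by skipping redundant tests and sharing identical subgraphs through a table.

```python
def build(f, n, unique=None, assignment=()):
    """Reduced MTBDD of f : {0,1}^n -> R, expanding x_1 < x_2 < ... < x_n in order."""
    if unique is None:
        unique = {}
    if len(assignment) == n:
        leaf = ("leaf", f(*assignment))          # terminal vertex with its value
        return unique.setdefault(leaf, leaf)
    var = len(assignment) + 1
    left = build(f, n, unique, assignment + (0,))    # cofactor x_var = 0
    right = build(f, n, unique, assignment + (1,))   # cofactor x_var = 1
    if left == right:
        return left                              # redundant test: skip this vertex
    node = ("node", var, left, right)
    return unique.setdefault(node, node)         # share identical subgraphs

def evaluate(q, assignment):
    """Follow left/right edges according to a 0/1 assignment (x_1, ..., x_n)."""
    while q[0] == "node":
        q = q[3] if assignment[q[1] - 1] else q[2]
    return q[1]

def vertices(q, seen=None):
    """Set of distinct vertices reachable from q."""
    seen = set() if seen is None else seen
    if q not in seen:
        seen.add(q)
        if q[0] == "node":
            vertices(q[2], seen)
            vertices(q[3], seen)
    return seen
```

A function that ignores $x_2$ yields a diagram with a single nonterminal vertex, and a constant function collapses to one terminal vertex.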

Each MTBDD $Q$ over $(x_1, \ldots, x_n)$ represents a function $F_Q : \{0,1\}^n \to \mathbb{R}$,
and, vice versa, each function $F : \{0,1\}^n \to \mathbb{R}$ can be described by a unique reduced
MTBDD over $(x_1, \ldots, x_n)$. In the sequel, by the MTBDD for a function $F : \{0,1\}^n \to \mathbb{R}$
we mean the unique reduced MTBDD $Q$ with $F_Q = F$. If all terminal vertices are
labelled by 0 or 1, i.e. if the associated function $F_Q$ is a boolean function, the MTBDD
specializes to a BDD over $(x_1, \ldots, x_n)$.

MTBDDs are used to represent $D$-valued matrices as follows. Consider a $2^m \times 2^m$
matrix $A$. Its elements $a_{ij}$ can be viewed as the values of a function
$f_A : \{1, \ldots, 2^m\} \times \{1, \ldots, 2^m\} \to D$, where $f_A(i,j) = a_{ij}$. Using the standard encoding
$c : \{0,1\}^m \to \{1, \ldots, 2^m\}$ of boolean sequences of length $m$ into the integers, this function may be
interpreted as a $D$-valued boolean function $f : \{0,1\}^{2m} \to D$ where $f(x,y) = f_A(c(x), c(y))$
for $x = (x_1 \ldots x_m)$ and $y = (y_1 \ldots y_m)$. This transformation now allows
matrices to be represented as MTBDDs. In order to obtain an efficient MTBDD-representation,
the variables of $f$ are permuted. Instead of the MTBDD for
$f(x_1, \ldots, x_m, y_1, \ldots, y_m)$, we use the MTBDD obtained from $f(x_1, y_1, x_2, y_2, \ldots, x_m, y_m)$. This
convention imposes a recursive structure on the matrix from which efficient recursive
algorithms for all standard matrix operations are derived [16].


4.1  Representing  labelled  Markov  chains  by  MTBDDs 

To  represent  the  transition  matrix  of  a  labelled  Markov  chain  by  a  MTBDD  we  abstract 
from  the  names  of  states  and  instead,  similarly  to  [8,  15],  use  binary  tuples  of  atomic 
propositions that are true in the state. Let $M = (S, P, L)$ be a labelled Markov chain.
We fix an enumeration $a_1, \ldots, a_n$ of the atomic propositions and identify each state $s$
with the boolean $n$-tuple $e(s) = (b_1, \ldots, b_n)$ where $b_i = 1$ iff $a_i \in L(s)$. In what follows,
we identify $P$ with the function $F : \{0,1\}^{2n} \to [0,1]$,
$F(x_1, y_1, \ldots, x_n, y_n) = P((x_1, \ldots, x_n), (y_1, \ldots, y_n))$, and represent $M$ by the MTBDD for $P$ over
$(x_1, y_1, \ldots, x_n, y_n)$. The associated MTBDD is denoted by $P$.

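Example 2. For the system in Example 1 we use the encoding $e(s_{init}) = 00$, $e(s_{del}) = 11$,
$e(s_{lost}) = 01$, $e(s_{error}) = 10$. The values of the matrix $P$, the function $F$ and the
MTBDD $P$ for $F$ are given by:

$$P = \begin{array}{c|cccc}
   & 00 & 01 & 10 & 11 \\ \hline
00 & 0 & 0 & 0 & 1 \\
01 & 0 & 0 & 0 & 1 \\
10 & 1 & 0 & 0 & 0 \\
11 & \frac{98}{100} & \frac{1}{100} & \frac{1}{100} & 0
\end{array}$$

$$F(x_1, y_1, x_2, y_2) = \begin{cases}
1 & \text{if } x_1 y_1 x_2 y_2 \in \{0101, 0111, 1000\} \\
\frac{1}{100} & \text{if } x_1 y_1 x_2 y_2 \in \{1011, 1110\} \\
\frac{98}{100} & \text{if } x_1 y_1 x_2 y_2 = 1010 \\
0 & \text{otherwise.}
\end{cases}$$

[Figure: the MTBDD $P$ for $F$.]

(The thick lines stand for the "right" edges, the thin lines for the "left" edges.) ■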
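
The interleaved encoding of Example 2 can be checked mechanically. The following Python sketch is an illustration only: the dictionary layout and the names `P`, `F` and `support` are assumptions made here. It rebuilds $F$ from the matrix $P$ and groups the argument tuples $x_1 y_1 x_2 y_2$ by the value they are mapped to.

```python
from fractions import Fraction
from itertools import product

h = Fraction(1, 100)
# Transition matrix of Example 2; rows and columns are the state encodings.
P = {
    "00": {"11": Fraction(1)},                 # s_init  -> s_del
    "01": {"11": Fraction(1)},                 # s_lost  -> s_del
    "10": {"00": Fraction(1)},                 # s_error -> s_init
    "11": {"00": 98 * h, "01": h, "10": h},    # s_del   -> s_init / s_lost / s_error
}

def F(x1, y1, x2, y2):
    """F(x1, y1, x2, y2) = P((x1, x2), (y1, y2)): row bits x and column bits y interleaved."""
    row, col = f"{x1}{x2}", f"{y1}{y2}"
    return P[row].get(col, Fraction(0))

# Collect, for every non-zero value, the bit strings x1 y1 x2 y2 mapped to it.
support = {}
for bits in product((0, 1), repeat=4):
    value = F(*bits)
    if value != 0:
        support.setdefault(value, set()).add("".join(map(str, bits)))
```

The three non-zero classes in `support` coincide with the three cases listed in the definition of $F$.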


4.2  Operators  on  MTBDDs 

Our  model  checking  algorithm  makes  use  of  several  operators  on  MTBDDs  proposed 
in  Bryant  [7]  and  Clarke  et  al  [14].  We  briefly  describe  them  below. 

Operator BDD(·): takes an MTBDD $Q$ and an interval $I$, and returns the BDD representing
the function $F(x) = 1$ if $F_Q(x) \in I$, else $F(x) = 0$. We obtain $B = BDD(Q, I)$
from $Q$ by changing the values of the terminal vertices (into 1 or 0 depending
on whether or not $value(v) \in I$) and applying Bryant's reduction procedure
REDUCE(·). We write $BDD(Q, > p)$ rather than $BDD(Q, ]p, \infty[)$ and $BDD(Q, \geq p)$
rather than $BDD(Q, [p, \infty[)$.

Operator APPLY(·): allows elementwise application of the binary operator $op$ to two
MTBDDs. If $op$ is a binary operator on reals (e.g. multiplication $*$ or minus $-$) and $Q_1$,
$Q_2$ are MTBDDs over $x$ then $APPLY(Q_1, Q_2, op)$ yields a MTBDD over $x$ which
represents the function $f(x) = F_{Q_1}(x) \; op \; F_{Q_2}(x)$.
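
The two operators BDD(·) and APPLY(·) can be sketched as recursive traversals. The code below is a hedged illustration, not the implementation of [7] or [14]: vertices are nested tuples, the names `apply_op` and `threshold` are chosen here, the interval is passed as a predicate, and no result cache is used, so the running time can be exponential where a real package would memoise.

```python
import operator
from fractions import Fraction

def leaf(value):
    return ("leaf", value)

def node(var, left, right):
    """Nonterminal vertex; a test with identical children is dropped (reduction)."""
    return left if left == right else ("node", var, left, right)

def apply_op(q1, q2, op):
    """MTBDD for x -> F_q1(x) op F_q2(x); both inputs use the same variable order."""
    if q1[0] == "leaf" and q2[0] == "leaf":
        return leaf(op(q1[1], q2[1]))
    # Branch on the smallest variable tested at either root.
    var = min(q[1] for q in (q1, q2) if q[0] == "node")

    def cofactor(q, bit):
        if q[0] == "node" and q[1] == var:
            return q[3] if bit else q[2]
        return q  # q does not test var at its root

    return node(var,
                apply_op(cofactor(q1, 0), cofactor(q2, 0), op),
                apply_op(cofactor(q1, 1), cofactor(q2, 1), op))

def threshold(q, predicate):
    """BDD(Q, I): map each terminal value to 1 or 0, reducing on the way up."""
    if q[0] == "leaf":
        return leaf(1 if predicate(q[1]) else 0)
    return node(q[1], threshold(q[2], predicate), threshold(q[3], predicate))

def evaluate(q, assignment):
    while q[0] == "node":
        q = q[3] if assignment[q[1] - 1] else q[2]
    return q[1]

# Usage with two small hypothetical MTBDDs over (x_1, x_2).
q1 = node(1, leaf(Fraction(1, 5)), node(2, leaf(Fraction(1, 2)), leaf(Fraction(1))))
q2 = node(2, leaf(Fraction(2)), leaf(Fraction(4)))
prod = apply_op(q1, q2, operator.mul)
at_least_one = threshold(prod, lambda v: v >= 1)
```

In this usage example the thresholded diagram no longer depends on $x_2$, which shows how replacing terminal values can shrink the diagram.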

Operator COMPOSE$_k$(·): This operator allows the composition of a real function
$F : \{0,1\}^{n+k} \to \mathbb{R}$ and boolean functions $G_i : \{0,1\}^n \to \{0,1\}$, $i = 1, \ldots, k$, giving
$H(x) = F(x, G_1(x), \ldots, G_k(x))$.

Matrix and vector operators: The standard operations on matrices and vectors have
corresponding operations on the MTBDDs that represent them [13]. If MTBDDs $A$
and $Q$ over $2n$ and $n$ variables represent the matrix $A$ and vector $q$ respectively, then
$MV\_MULTI(A, Q)$ denotes the MTBDD over $n$ variables that represents the vector
$A \cdot q$.

Operator SOLVE(·): [8] presents a method to decompose a regular matrix $A$ into
lower and upper triangular matrices and a permutation matrix. Using this LU-decomposition
we can obtain an operator $SOLVE(A, Q)$ that takes as its input a MTBDD $A$
over $2n$ variables where the corresponding matrix $A$ is regular and a MTBDD $Q$ over $n$
variables which represents a vector $q$, and returns a MTBDD $Q'$ over $n$ variables which
represents the unique solution of the linear equation system $A \cdot x = q$. Alternatively,
we can use iterative techniques to solve the equations; our experiments indicate that this
performs better.
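
The text does not say which iterative technique is meant. As one possible reading, the sketch below applies a plain fixed-point iteration $x \leftarrow P'x + q$ to a system of the form $(I - P')x = q$, on explicit lists rather than MTBDDs. The function name, the tolerance and the choice of iteration are assumptions made here; the numbers are those of the system $A \cdot x = q$ with $A = I - P'$ solved in Example 3 of Section 5.

```python
def solve_iteratively(P_prime, q, tol=1e-12, max_iter=10_000):
    """Approximate the solution of (I - P') x = q by iterating x <- P' x + q."""
    n = len(q)
    x = [0.0] * n
    for _ in range(max_iter):
        new_x = [q[i] + sum(P_prime[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Stop once successive iterates agree up to the tolerance.
        if max(abs(a - b) for a, b in zip(new_x, x)) < tol:
            return new_x
        x = new_x
    raise RuntimeError("iteration did not converge")

# States ordered 00, 01, 10, 11; P' keeps only the transitions inside V' = {01, 11}.
P_prime = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.01, 0.0, 0.0],
]
q = [0.0, 0.0, 1.0, 0.98]
x = solve_iteratively(P_prime, q)
```

The iteration converges here because the powers of this particular $P'$ tend to zero; it is not a general-purpose replacement for the LU-based operator.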

4.3 Description of (MT)BDDs by relational terms of the $\mu$-calculus

We will use the $\mu$-calculus as a notation for describing (MT)BDDs. In the algorithm
in the next section, all our (MT)BDDs are either over $2n$ variables (in which case they
represent $2^n \times 2^n$ matrices), or over $n$ variables (in which case they represent vectors of
length $2^n$). For example, if $B$, $C$ are BDDs over $n$ variables and $u = (u_1, \ldots, u_n)$,
$v = (v_1, \ldots, v_n)$, then $D = \lambda u v \, [B(u) \wedge C(v)]$ is a BDD over $2n$ variables; if
$B$, $C$ represent the vectors $(b_i)_{1 \le i \le 2^n}$ and $(c_i)_{1 \le i \le 2^n}$ respectively, then $D$ represents
the matrix whose element in the $i$th row and $j$th column is $b_i \wedge c_j$. The BDD
$E = \lambda u \, [B(u) \wedge C(u)]$ is a BDD over $n$ variables, representing the vector $(b_i \wedge c_i)_{1 \le i \le 2^n}$.

We write TRUE for the BDD over $n$ variables which returns 1 in all cases of its
arguments. We write $\neg B$ instead of $\lambda x [\neg B(x)]$, and $B_1 \wedge B_2$ for the BDD
$\lambda x [B_1(x) \wedge B_2(x)]$. If $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$ then $x = y$ abbreviates the formula
$\bigwedge_{1 \le i \le n} (x_i \leftrightarrow y_i)$.

We require one further operator. If the labelled Markov chain $M = (S, P, L)$ is represented
by a MTBDD $P$ as described in Section 4.1, and $B_1$, $B_2$ are BDDs that represent
the characteristic functions of subsets $S_1$, $S_2$ of $S$, then $REACH(B_1, B_2, BDD(P, > 0))$
represents the set of states $s \in S$ from which there exists an execution sequence
$s = s_0, s_1, \ldots, s_k$ with $k \geq 0$ and $s_0, \ldots, s_{k-1} \in S_1$, $s_k \in S_2$, and which is used in
the operator UNTIL(·) defined in Section 5.

Operator REACH(·): Let $B_1$, $B_2$ be BDDs with $n$ variables and $T$ a BDD with $2n$
variables. We define $REACH(B_1, B_2, T)$ to be the BDD over $n$ variables which is
given by the $\mu$-calculus formula $\mu Z \, \lambda x \, [B_2(x) \vee (B_1(x) \wedge \exists y [Z(y) \wedge T(x, y)])]$. This
operator uses the method of [8] to obtain the BDD for a term involving the least fixed
point operator $\mu$.
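
On explicit sets the least fixed point can be computed by iterating from the empty set until nothing changes. The Python sketch below is an illustration with names chosen here; it works on sets of states and pairs instead of BDDs. The usage data is the positive-probability transition relation of Example 1, with $S_1 = \{s_{del}, s_{lost}\}$ and $S_2 = \{s_{init}\}$.

```python
def reach(S1, S2, T):
    """Least fixed point of Z = S2 | (S1 & pre(Z)), where T is a set of edges (s, t)."""
    Z = set()
    while True:
        pre = {s for (s, t) in T if t in Z}      # states with some successor in Z
        new_Z = set(S2) | (set(S1) & pre)
        if new_Z == Z:
            return Z
        Z = new_Z

# Transitions of Example 1 that have positive probability.
T = {
    ("init", "del"), ("lost", "del"), ("error", "init"),
    ("del", "init"), ("del", "lost"), ("del", "error"),
}
V = reach({"del", "lost"}, {"init"}, T)
```

Because the update is monotone in `Z`, the iteration from the empty set terminates on a finite state space and returns the least fixed point.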

5  Model  checking  for  PCTL 

Our model checking algorithm for PCTL is based on established BDD techniques
(i.e. converting boolean formulas to their BDD representation), which it combines with
a new method, namely expressing the probability calculation for the probabilistic formulas
in terms of MTBDDs. In the case of $[X\Phi]_{\sqsupseteq p}$ the probability is calculated by
multiplying the transition matrix by the boolean vector set to 1 iff the state satisfies $\Phi$,
whereas for $[\Phi_1 U \Phi_2]_{\sqsupseteq p}$ we derive an operator called UNTIL(·), based on [24], which
we express in terms of MTBDDs.

Let $M = (S, P, L)$ be a labelled Markov chain which is represented by a MTBDD
$P$ over $2n$ variables as described in Section 4.1. For each PCTL formula $\Phi$ we define
a BDD $B[\Phi]$ over $x = (x_1, \ldots, x_n)$ that represents $Sat(\Phi) = \{s \in S : s \models \Phi\}$. We
compute the BDD representation $B[\Phi]$ of a PCTL formula $\Phi$ by structural induction:

$B[tt] = TRUE$
$B[a_i] = \lambda x [x_i]$
$B[\neg\Phi] = \neg B[\Phi]$
$B[\Phi_1 \wedge \Phi_2] = B[\Phi_1] \wedge B[\Phi_2]$
$B[\,[X\Phi]_{\sqsupseteq p}\,] = BDD(MV\_MULTI(P, B[\Phi]), \sqsupseteq p)$
$B[\,[\Phi_1 U \Phi_2]_{\sqsupseteq p}\,] = BDD(UNTIL(B[\Phi_1], B[\Phi_2], P), \sqsupseteq p)$
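
The structural induction can be mirrored on explicit state sets. The sketch below is an assumption-laden illustration, not the symbolic procedure: formulas are nested tuples in a format invented here, sets of states stand in for BDDs, and only the cases tt, atomic proposition, negation, conjunction and next-step are covered (the until case is omitted). The chain is that of Examples 1 and 2.

```python
from fractions import Fraction

STATES = ["init", "lost", "error", "del"]
LABELS = {"init": set(), "lost": {"a2"}, "error": {"a1"}, "del": {"a1", "a2"}}
P = {
    "init": {"del": Fraction(1)},
    "lost": {"del": Fraction(1)},
    "error": {"init": Fraction(1)},
    "del": {"init": Fraction(98, 100), "lost": Fraction(1, 100), "error": Fraction(1, 100)},
}

def sat(formula):
    """Set of states satisfying a formula given as a nested tuple (hypothetical format)."""
    kind = formula[0]
    if kind == "tt":
        return set(STATES)
    if kind == "atom":
        return {s for s in STATES if formula[1] in LABELS[s]}
    if kind == "not":
        return set(STATES) - sat(formula[1])
    if kind == "and":
        return sat(formula[1]) & sat(formula[2])
    if kind == "next":
        # ("next", sub, strict, p) encodes [X sub]_{>p} if strict, else [X sub]_{>=p}.
        _, sub, strict, p = formula
        target = sat(sub)
        result = set()
        for s in STATES:
            # One row of the matrix-vector product P * (0/1 vector of target).
            prob = sum(P[s].get(t, 0) for t in target)
            if prob > p or (not strict and prob == p):
                result.add(s)
        return result
    raise ValueError(f"unsupported operator: {kind}")

correctly_delivered = ("and", ("not", ("atom", "a1")), ("not", ("atom", "a2")))
next_delivered = sat(("next", correctly_delivered, False, Fraction(9, 10)))
```

Here `next_delivered` collects the states from which the next state satisfies $\neg a_1 \wedge \neg a_2$ with probability at least 0.9.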

The operator $UNTIL(B[\Phi_1], B[\Phi_2], P)$ assigns to each state $s \in S$ the probability
of the set of full paths from $s$ satisfying $\Phi_1 U \Phi_2$; formally, it represents the function
$S \to [0,1]$, $s \mapsto p_s$, where $p_s = Prob\{\pi \in Path_\omega(s) : \pi \models \Phi_1 U \Phi_2\}$. Our method
for computing $p_s$ is based on the partition of $S$ introduced in [24, 18], but we must
compute with BDDs. We first compute the set $V = \{s \in S : p_s > 0\}$ and then set
$V' = V \setminus Sat(\Phi_2)$. We then have: $p_s = 1$ if $s \models \Phi_2$; $p_s = 0$ if $s \notin V$; and for the
remaining cases (i.e. those such that $s \in V'$)

$$p_s = \sum_{t \in V'} P(s,t) \cdot p_t + \sum_{t \in Sat(\Phi_2)} P(s,t) \cdot p_t + \sum_{t \in S \setminus V} P(s,t) \cdot p_t.$$

In the second term, each $p_t = 1$ and in the third term, each $p_t = 0$. Therefore $p_s$
($s \in V'$) satisfies a $|V'|$-dimensional equation system of the form $x = Ax + b$, or
equivalently $(I - A) x = b$ where $I$ is the $|V'| \times |V'|$ identity matrix. One can show
this system has a unique solution using the method in [24, 18].

We now demonstrate how UNTIL(·) can be expressed in terms of MTBDDs. Let
$B_i = B[\Phi_i]$, $i = 1, 2$. The set $V$ is given by the BDD $B = REACH(B_1, B_2, BDD(P, > 0))$,
$V'$ by $B' = \lambda x \, [B(x) \wedge \neg B_2(x)]$. In order to avoid the BDD for the "new"
transition matrix $A$ with $\lceil \log_2 |V'| \rceil$ variables, we instead reformulate the equation in
terms of the matrix $P' = (P'_{s,t})_{s,t \in S}$ which is given by: $P'_{s,t} = P(s,t)$ if $s, t \in V'$ and
$P'_{s,t} = 0$ in all other cases. The MTBDD $P'$ for $P'$ can be obtained from the MTBDD
$P$ representing the Markov transition matrix. The following lemma shows that $I - P'$
is regular (we omit the proof).

Lemma 1. Let $V'$, $P'$, $I$ be as above. Then, $I - P'$ is regular. The unique solution
$x = (x_s)_{s \in S}$ of the linear equation system $(I - P') \cdot x = q$ where $q = (q_s)_{s \in S}$,
$q_s = \sum_{t \in Sat(\Phi_2)} P(s,t)$, satisfies: $x_s = p_s$ if $s \in V'$.

The algorithm for the operator UNTIL(·) is shown in Figure 1. It first calculates the
BDDs $B$ and $B'$, for $V$ and $V'$. $B''$ is used as a mask to obtain $P'$ from $P$; it sets
to 0 the entries not corresponding to states in $V'$. We next calculate the MTBDD $Q$
for the vector $q$, and use the operator SOLVE(·) to obtain the MTBDD $Q'$ satisfying
$F_{Q'}(s) = p_s$ for all $s \in V'$. The result, the MTBDD $Q''$ for the vector $p = (p_s)_{s \in S}$, is
obtained from the MTBDD for the function $F(x) = \max\{F_{B_2}(x), F_{Q'}(x) \cdot F_{B'}(x)\}$
which uses $Q'$ for all $s \in V'$ and ensures that 1 is returned as the probability of the states
already satisfying $\Phi_2$.

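Algorithm: $UNTIL(B_1, B_2, P)$

Input: A labelled Markov chain represented by a MTBDD $P$ over $2n$ variables,
BDDs $B_1$, $B_2$ over $n$ variables

Output: MTBDD $X$ over $n$ variables which represents the function that assigns to each
state the probability of a path from the state reaching a $B_2$-state via an execution
sequence through $B_1$-states

Method:
  $B := REACH(B_1, B_2, BDD(P, > 0))$;  $B' := \lambda x \, [B(x) \wedge \neg B_2(x)]$;
  $B'' := \lambda x_1 y_1 \ldots x_n y_n \, [B'(x_1, \ldots, x_n) \wedge B'(y_1, \ldots, y_n)]$;
  $P' := APPLY(P, B'', *)$;  $I := \lambda x_1 y_1 \ldots x_n y_n \, [x = y]$;
  $A := APPLY(I, P', -)$;  $Q := MV\_MULTI(P, B_2)$;
  $Q' := SOLVE(A, Q)$;  $Q'' := APPLY(B_2, APPLY(Q', B', *), \max)$;
  Return$(REDUCE(Q''))$.

Fig. 1. Algorithm $UNTIL(B_1, B_2, P)$

Example 3. Let $\Phi = [\, try\_to\_deliver \; U \; correctly\_delivered \,]_{\geq 0.9}$ where
$try\_to\_deliver = a_2$ and $correctly\_delivered = \neg a_1 \wedge \neg a_2$. We consider the system
in Example 1. Our algorithm first computes the BDDs $B_1$ for $Sat(try\_to\_deliver) = \{s_{del}, s_{lost}\}$,
$B_2$ for $Sat(correctly\_delivered) = \{s_{init}\}$ and then applies Algorithm
$UNTIL(B_1, B_2, P)$. $V = \{s_{init}, s_{del}, s_{lost}\}$ is represented by the BDD $B$,
$V' = \{s_{del}, s_{lost}\}$ by the BDD $B'$. Thus, $B''$, $P'$ and $A$ stand for the matrices
(rows and columns ordered 00, 01, 10, 11)

$$B'' = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix}, \quad
P' = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & \frac{1}{100} & 0 & 0 \end{pmatrix}, \quad
A = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & -\frac{1}{100} & 0 & 1 \end{pmatrix}.$$

$B_2$ (viewed as a vector) is $q_2 = (1, 0, 0, 0)$. Thus, $Q$ is the MTBDD for the vector
$P \cdot q_2 = (0, 0, 1, 0.98)$. We solve the linear equation system

$$\begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 0 \\ 0 & -\frac{1}{100} & 0 & 1 \end{pmatrix} \cdot x = \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0.98 \end{pmatrix}$$

which yields the solution $x = (0, \frac{98}{99}, 1, \frac{98}{99})$ (represented by the MTBDD $Q'$). Moreover,
the MTBDD $APPLY(Q', B', *)$ can be identified with the vector $(0, \frac{98}{99}, 0, \frac{98}{99})$.
$UNTIL(B_1, B_2, P)$ and the BDD $B[\Phi]$ are of the following form.

[Figure: the MTBDD $UNTIL(B_1, B_2, P)$ and the BDD $B[\Phi]$.]

Thus, $B[\Phi]$ represents the characteristic function for $Sat(\Phi) = \{s_{init}, s_{del}, s_{lost}\}$. ■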
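
The steps of Figure 1 can be replayed on explicit matrices to check the numbers of Example 3. The sketch below is an illustration under assumptions made here: plain Python dictionaries and lists replace BDDs and MTBDDs, exact Gauss-Jordan elimination stands in for SOLVE(·), and the function names are hypothetical.

```python
from fractions import Fraction

def gauss_solve(A, b):
    """Solve A x = b exactly by Gauss-Jordan elimination (A must be regular)."""
    n = len(b)
    M = [list(row) + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

def until_probabilities(states, P, sat1, sat2):
    """Probability of Phi1 U Phi2 for every state, following the steps of Fig. 1."""
    # V: states that reach sat2 via sat1-states with positive probability (REACH).
    V = set()
    while True:
        new_V = set(sat2) | {s for s in sat1 if any(P[s].get(t, 0) > 0 for t in V)}
        if new_V == V:
            break
        V = new_V
    V_prime = V - set(sat2)
    # A = I - P', where P' agrees with P on V' x V' and is zero elsewhere.
    A = [[Fraction(int(s == t)) - (P[s].get(t, 0) if s in V_prime and t in V_prime else 0)
          for t in states] for s in states]
    # q = P * (characteristic vector of sat2).
    q = [sum((P[s].get(t, Fraction(0)) for t in sat2), Fraction(0)) for s in states]
    x = gauss_solve(A, q)
    # Mask the solution with V' and return 1 for states already in sat2.
    return {s: max(Fraction(int(s in sat2)), x[i] * int(s in V_prime))
            for i, s in enumerate(states)}

states = ["00", "01", "10", "11"]   # s_init, s_lost, s_error, s_del
P = {
    "00": {"11": Fraction(1)},
    "01": {"11": Fraction(1)},
    "10": {"00": Fraction(1)},
    "11": {"00": Fraction(98, 100), "01": Fraction(1, 100), "10": Fraction(1, 100)},
}
p = until_probabilities(states, P, sat1={"01", "11"}, sat2={"00"})
sat_phi = {s for s in states if p[s] >= Fraction(9, 10)}
```

The state 10 ($s_{error}$) shows why the final masking step matters: the linear system assigns it the value 1, but it lies outside $V'$ and $Sat(\Phi_2)$, so its probability is reset to 0.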


6  Implementing  PCTL  model  checking 

We  are  integrating  PCTL  symbolic  model  checking  within  Verus  [9],  which  is  a  tool 
specifically  designed  for  the  verification  of  finite-state  real-time  systems.  Verus  has 
been  used  already  to  verify  several  interesting  real-time  systems:  an  aircraft  controller, 
a  medical  monitor,  the  PCI  local  bus,  and  a  robotics  controller.  These  examples  have  not 
been  originally  modeled  using  probabilities.  However,  these  systems  exhibit  behaviors 
which  can  best  be  described  probabilistically.  The  integration  of  PCTL  model  check¬ 
ing  with  Verus  allows  us  to  verify  stochastic  properties  of  these  and  other  interesting 
applications. 

The Verus language is an imperative language with a syntax resembling that of the C
language with additional special primitives to express timing aspects such as deadlines,
priorities, and delays. An important feature of Verus is the use of the wait statement
to control the passage of time. In Verus time only passes when a wait statement is
executed: non-wait statements execute in zero time. This feature allows a more accurate
control of time and leads to models with fewer states, since consecutive statements not
separated by a wait statement are compiled into a single state. To describe probabilistic
transitions we extend the Verus language with the probabilistic select statement.

From  the  Verus  description  of  the  application,  the  tool  generates  automatically  a 
labeled  state-transition  graph  and  the  corresponding  transition  probability  matrix  using 
BDDs  and  MTBDDs  respectively. 

The  first  experimental  results  of  our  PCTL  symbolic  model  checking  implementa¬ 
tion  are  promising:  Farrow’s  Protocol  (which  is  of  a  similar  size  to  Example  1)  can  be 
verified  in  less  than  a  second.  We  have  modeled  a  fault  tolerant  system  [23,  p.  168-171] 
with  three  processors  that  has  about  35000  reachable  states  (out  of  10^  states).  A  safety 
property  of  this  system  took  only  a  few  seconds  to  check.  Next  we  plan  to  evaluate 
how  well  PCTL  symbolic  model  checking  performs  as  a  formal  verification  tool  in  real 
applications  by  modeling  industrial  size  systems. 


7  Concluding  remarks  and  further  directions 

We  have  proposed  a  symbolic  model  checking  procedure  for  the  logic  PCTL  which  we 
are  implementing  using  MTBDDs  in  Verus,  thus  forming  the  basis  of  an  efficient  tool 
for  verifying  probabilistic  systems.  Our  algorithm  can  be  extended  to  cater  for  “bounded 
until”  of  [24]  which  is  useful  in  timing  analysis  of  systems.  We  expect  that  MTBDDs 
can  be  used  to  derive  PCTL*  model  checking  by  applying  the  methods  of  [18].  Like¬ 
wise,  testing  of  probabilistic  bisimulation  and  simulation  [3,  19]  can  be  implemented 
using  MTBDDs.  An  extension  to  the  case  of  infinite  state  systems,  perhaps  by  appropri¬ 
ate  combination  with  induction,  as  well  as  a  generalization  to  allow  non-determinism, 
would  be  desirable. 

Abstract. The expectation of the absolute value of the difference between
the heights of two random binary search trees of n nodes is less
than 6.25 for infinitely many n. Given a plausible assumption, this expectation
is less than 4.96 for all but a finite number of values of n.


1  Introduction 

A  binary  search  tree  (BST)  of  n  nodes  is  constructed  from  n  distinct  keys  in 
random  order  by  inserting  each  key  in  turn  into  an  initially  empty  tree  by  the 
familiar  algorithm  which  inserts  a  key  into  an  empty  tree  by  constructing  a  new 
root  node  with  this  key  and  otherwise  inserts  the  key  into  the  left  or  right  subtree 
depending  on  whether  it  is  smaller  or  larger  than  the  key  at  the  root.  Two  other 
equivalent  definitions  are  often  useful  in  considering  the  shape  or  in  particular 
the  height  of  such  a  tree: 

- A random tree of $n$ nodes is empty if $n$ is zero and otherwise consists of
a root node and a left subtree of $l$ nodes and a right subtree of $n - 1 - l$
nodes where $l$ is an integer chosen uniformly on $0 \ldots n-1$; the subtrees are
constructed in the same way, all the random choices being independent.

- The $i$-th node is inserted into the tree by choosing one of the $i$ external
nodes of the tree, each with the same probability $1/i$, and replacing it by a
new internal node. Hence we have the important result that the probability
of this insertion increasing the height of the tree is $1/i$ times the number
of external nodes at the deepest level containing any external nodes, or
alternatively $2/i$ times the number of internal nodes at the deepest level
containing any internal nodes; we call these internal nodes at the deepest
level critical nodes.

We are interested in the distribution of the random variable $h(n)$ which is the
height of a tree constructed in any of these ways. $h(n)$ is also the stack depth used
by a straightforward version of Quicksort to sort $n$ randomly ordered distinct
values.
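
The quantity studied in this paper can be explored empirically. The Python sketch below builds a random BST by inserting a random permutation and estimates $E[|h_1(n) - h_2(n)|]$ by simulation. It is only an illustration: the function names, the seed handling, and the convention that the empty tree has height 0 and a single node has height 1 are assumptions made here, and a simulation of this kind proves nothing about the bounds.

```python
import random

def random_bst_height(n, rng):
    """Height of a BST built by inserting a uniformly random permutation of n keys."""
    keys = list(range(n))
    rng.shuffle(keys)
    left, right = {}, {}          # child links, indexed by the parent's key
    root, height = None, 0
    for key in keys:
        if root is None:
            root, depth = key, 1
        else:
            node, depth = root, 2  # depth the new key gets if it becomes a child of node
            while True:
                child = left if key < node else right
                if node in child:
                    node = child[node]
                    depth += 1
                else:
                    child[node] = key
                    break
        height = max(height, depth)
    return height

def mean_abs_height_difference(n, trials, seed=0):
    """Monte Carlo estimate of E[|h1(n) - h2(n)|] for two independent random BSTs."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += abs(random_bst_height(n, rng) - random_bst_height(n, rng))
    return total / trials
```

A call such as `mean_abs_height_difference(1000, 200)` gives a rough estimate for one value of $n$; the insertion loop is iterative so that tall trees do not hit Python's recursion limit.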

The mean value of $h(n)$ is known to be close to $c \log n$ where $c \approx 4.311$
is the larger root of $c \log(2e/c) = 1$. An upper bound of the form $(c + o(1)) \log n$
was shown in [4]; a lower bound of the form $(c - o(1)) \log n$ was shown in [1];
and finally the height was shown with high probability to lie within bounds
$c \log n \pm O(\log \log n)$ in [2].

Direct  calculation  for  small  to  moderate  values  of  n  and  random  construction 
of  larger  trees  [3]  have  shown  that  the  variance  of  the  height  remains  small  for 
quite  large  n  and  shows  no  sign  of  diverging.  To  date  the  only  explanation  of 
these results has been an upper bound of $O(\log^2 \log n)$ on the variance in [2].

In  this  paper  we  will  show  that  there  are  indefinitely  large  values  of  n  for 
which $E[|h(n) - E[h(n)]|]$ is less than 6.25. Although this does not prove anything
about the variance and does not necessarily apply to all $n$, it suggests very
strongly  that  the  distribution  remains  tightly  concentrated  around  its  mean.  If 
we  make  a  simple  and  plausible  assumption  about  the  convergence  of  the  number 
of  critical  nodes,  we  can  both  strengthen  the  bound  and  show  that  the  conclusion 
applies  to  all  sufficiently  large  n. 

In  section  2  we  will  prove  a  very  weak  version  of  the  main  theorem  which  we 
hope  will  illustrate  the  essential  (and  very  simple)  ideas  in  the  clearest  possible 
way.  In  section  3  we  will  give  the  strongest  form  that  we  yet  know  of  the  theorem. 
In  section  4  we  show  briefly  how  the  theorem  can  be  strengthened  further  if  we 
assume  that  the  expected  number  of  critical  nodes  converges.  Finally  in  section 
5  we  sketch  some  directions  for  further  work. 

2  A  weak  upper  bound 

We take $h_1(n)$ and $h_2(n)$ as two independent random variables each distributed
as the height of an $n$ node BST. Let $c$ be the limit of $E[h(n)] / \log n$. Let $\epsilon$ be an
arbitrary positive number.

Theorem 1. $E[|h_1(n) - h_2(n)|] < 6c \log 3 - 6 + \epsilon$ infinitely often.

(It follows immediately that $E[|h(n) - E[h(n)]|] < 6c \log 3 - 6 + \epsilon$ $(\approx 22.417 + \epsilon)$
infinitely often.)
e)  infinitely  often.) 

Proof  by  contradiction:  Suppose  the  contrary.  Then  for  large  enough  A, 
jF[|/?.i(??.)  h2{n)\]  >  6c log3  -  6  +  e  for  all  n  greater  than  TV.  Now  choose  a  1/ 

greater  than  TV  such  that  E[h{Ziy)]  <  E[h(i/)]  +  clog3  +  c/6.  (Infinitely  many 
such  ly  exist  since  E[h{n)]/  \ogn  — >  c.) 

Consider  a  random  tree  of  size  and  the  following  algorithm  to  choose  one 
of  its  immediate  subtrees  {L  is  its  larger  immediate  subtree  and  S  its  smaller): 
    if |L| > $2\nu$
    then
        choose L
    else
        if the height of S is greater than that of L
        then
            choose S
        else
            choose L
        fi
    fi

Note  that  the  probability  of  taking  the  first  case  is  2/3. 

Consider the height of the subtree chosen:

In case 1 we choose a random subtree of size greater than $\nu$: its expected height
is at least $E[h(\nu)]$. For the case of $|L| \le 2\nu$, it is clear that the mean height is
greater than it would be for two subtrees of size $\nu$, namely
$E[\max(h_1(\nu), h_2(\nu))] = E[(h_1(\nu) + h_2(\nu))/2 + |h_1(\nu) - h_2(\nu)|/2]$,
which is at least $E[h(\nu)] + 3c\log 3 - 3 + \epsilon/2$ by the choice of $\nu$.

Hence the expected height of the subtree chosen is at least
$E[h(\nu)] + c\log 3 - 1 + \epsilon/6$, making it greater than $E[h(3\nu)] - 1$, which is
clearly impossible since the maximum possible height for a subtree is one less
than the height of the tree. $\Box$
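The numerical value 22.417 of the bound $6c\log 3 - 6$ can be checked with a few lines of code. The sketch below rests on an assumption that is not stated in this text: that c is the well-known constant for random BST heights, namely the root larger than 2 of $c\ln(2e/c) = 1$ (about 4.311), and that log denotes the natural logarithm.

```python
import math

def bst_height_constant(tol=1e-12):
    """Root c > 2 of c * ln(2e / c) = 1, found by bisection (assumed definition of c)."""
    lo, hi = 2.0, 2.0 * math.e  # the left-hand side is 2 at c = 2 and 0 at c = 2e
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mid * math.log(2 * math.e / mid) > 1:
            lo = mid  # the function is decreasing on this interval
        else:
            hi = mid
    return (lo + hi) / 2

c = bst_height_constant()
bound = 6 * c * math.log(3) - 6
print(f"c = {c:.5f}, 6 c log 3 - 6 = {bound:.3f}")
```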

3  Three  ways  to  improve  the  bound 

The argument giving the upper bound of 22.417 can be strengthened in (at least)
three ways:

- Choose a value other than $3\nu$ for the size of the tree. Any size greater than
  $2\nu$ will give some non-trivial upper bound.

- Consider not only the immediate subtrees of the tree but possibly deeper
  subtrees. As long as a subtree has size greater than $2\nu$ we can consider its
  split into two subtrees, certain that one of them will have at least $\nu$ nodes.

- Where a subtree has size $\alpha\nu$ ($\alpha > 1$), we have lower bounded its height by
  that of a tree of size $\nu$. In fact, since every tree has at least one critical node,
  the mean height of an $\alpha\nu$ node tree must exceed that of a $\nu$ node tree by
  at least $2\sum_{i=\nu+1}^{\alpha\nu} 1/i$, which we can approximate by $2\log\alpha$ for large $\nu$.

Hence  we  have  a  general  scheme  for  a  method  of  choosing  a  subtree  and  an 
associated  upper  bound: 

Starting with a tree of size $k\nu$, with probability $1 - p$ find a subtree of size
$\alpha\nu$ ($\alpha \ge 1$), at a depth A from the root. Otherwise (with probability p) find two
disjoint subtrees with sizes $\beta\nu$ and $\gamma\nu$ ($\beta, \gamma \ge 1$), at depths B and C respectively
from the root; choose one of these two according to which was higher at the
moments when each contained exactly $\nu$ nodes.
We consider the expected value of the depth of the deepest node in the
subtree thus chosen. If only one subtree is found (probability $1 - p$) this is at
least $E[A + 2\log\alpha + h(\nu)]$. Otherwise (probability p) it is the expected depth of
the root of the subtree chosen plus the expected height of that subtree when its
size was $\nu$ plus the amount by which the height has increased since then, giving at
least $E[(B + C + 2\log\beta + 2\log\gamma)/2 + h(\nu) + |h_1(\nu) - h_2(\nu)|/2]$. Putting these
two together we find that the expected depth is at least

$$(1 - p)\,E[A + 2\log\alpha + h(\nu)] + p\,E[(B + C + 2\log\beta + 2\log\gamma)/2 + h(\nu) + |h_1(\nu) - h_2(\nu)|/2].$$

Since this must be no greater than $E[h(k\nu)]$ and we can choose infinitely many $\nu$ with
$E[h(k\nu)] < E[h(\nu)] + c\log k + \epsilon$, we can deduce an upper bound, valid infinitely
often, of

$$E[|h_1(\nu) - h_2(\nu)|/2] < \frac{c\log k}{p} - E[(B + C + 2\log\beta + 2\log\gamma)/2] - \frac{1-p}{p}\,E[A + 2\log\alpha] + O(\epsilon).$$

If we define a random variable f(n) as the value of $A + 2\log\alpha$ if only one
subtree is found and $(B + C + 2\log\beta + 2\log\gamma)/2$ if two subtrees are found, when
the scheme is applied to an n node tree, we can rewrite this inequality as

$$E[|h_1(\nu) - h_2(\nu)|] < 2\,(c\log k - E[f(k\nu)])/p \qquad (1)$$


Theorem 2. $E[|h_1(n) - h_2(n)|] < 6.247$ infinitely often.

Proof: We consider the particular instance of this scheme in which we choose
a cut off depth d and apply the algorithm choose2(T, d) where choose2 is defined
as


choose2(tree, depth);
    if depth = 0 then return tree fi;
    if size(tree) < $2\nu$ then
        return tree
    else
        let L and S be the larger and smaller immediate subtrees of tree
        if size(S) >= $\nu$
            then return {choose1(L, depth-1), choose1(S, depth-1)}
            else return {choose2(L, depth-1)}
        fi
    fi
end;
where choose1 chooses some subtree of size at least $\nu$:

choose1(tree, depth);
    if depth = 0 then return tree fi;
    if size(tree) < $2\nu$ then
        return tree
    else
        return choose1(larger subtree of tree, depth - 1)
    fi
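The two selection procedures can be written as a short runnable sketch. This is one possible reading of the pseudocode, under stated assumptions: the `Node` class, `build_balanced` and `split` are illustrative helpers that do not appear in the paper, ties between equally large subtrees are broken arbitrarily, and both functions expect a tree with at least `nu` nodes.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    left: Optional["Node"]
    right: Optional["Node"]
    size: int  # number of nodes in the subtree rooted here

def size(tree: Optional[Node]) -> int:
    return tree.size if tree is not None else 0

def build_balanced(n: int) -> Optional[Node]:
    """Illustrative helper: a perfectly balanced tree shape with n nodes."""
    if n == 0:
        return None
    left = (n - 1) // 2
    return Node(build_balanced(left), build_balanced(n - 1 - left), n)

def split(tree: Node):
    """Return (larger, smaller) immediate subtrees."""
    a, b = tree.left, tree.right
    return (a, b) if size(a) >= size(b) else (b, a)

def choose1(tree: Node, depth: int, nu: int) -> Node:
    """Follow the larger subtree until the cut-off depth or until size < 2*nu."""
    if depth == 0 or size(tree) < 2 * nu:
        return tree
    larger, _ = split(tree)
    return choose1(larger, depth - 1, nu)

def choose2(tree: Node, depth: int, nu: int) -> List[Node]:
    """Return one or two disjoint subtrees, each with at least nu nodes."""
    if depth == 0 or size(tree) < 2 * nu:
        return [tree]
    larger, smaller = split(tree)
    if size(smaller) >= nu:
        return [choose1(larger, depth - 1, nu), choose1(smaller, depth - 1, nu)]
    return choose2(larger, depth - 1, nu)

found = choose2(build_balanced(15), depth=5, nu=3)
print([t.size for t in found])
```

Whenever the input has at least `2*nu` nodes, its larger subtree has at least `nu` nodes, which is why every returned subtree meets the size requirement.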


To apply inequality (1) we need to know the value of p and that of $E[f(k\nu)]$.
Defining p(n, depth) as the probability of finding two subtrees of size at least $\nu$
when starting with a tree of size n ($\ge \nu$) at depth levels above the cut off level,
we obtain

p(n, depth) =
    if n < $2\nu$ or depth = 0
    then
        0
    else
        $1 - 2\nu/n + (2\nu/n) \times E[p(n', depth - 1)]$
    fi

where the expectation is taken over subtrees of sizes n' from $n - \nu$ to $n - 1$.
Further defining pp(x, depth) as the limit of $p(x\nu, depth)$ as $\nu$ tends to infinity,
we find that

pp(x, depth) =
    if x < 2 or depth = 0
    then
        0
    else
        $1 - 2/x + (2/x) \int_{x-1}^{x} pp(y, depth - 1)\, dy$
    fi
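The recursive definition of pp can be evaluated numerically for small depths. The sketch below assumes the integral runs over y from x - 1 to x, as the averaging over subtree sizes from $n - \nu$ to $n - 1$ suggests, and uses a plain midpoint rule; it is an illustration, not the Maple computation used by the authors. The cost grows like `steps ** (depth - 1)`, so it is only practical for very small depths.

```python
def pp(x, depth, steps=200):
    """Limit probability of finding two subtrees, by nested midpoint quadrature."""
    if x < 2 or depth == 0:
        return 0.0
    h = 1.0 / steps
    # Midpoint rule for the integral of pp(y, depth - 1) over y in [x - 1, x].
    integral = h * sum(pp(x - 1 + (i + 0.5) * h, depth - 1, steps)
                       for i in range(steps))
    return 1 - 2 / x + (2 / x) * integral

print(pp(3.9, 1), pp(3.9, 2))
```

For depth 1 the integral vanishes and the value reduces to 1 - 2/x, which gives a quick sanity check.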


Similarly defining ff(x, depth) and gg(x, depth) as the limits as $\nu$ tends to infinity
of the expected values of (root depth + 2 log size) averaged over the nodes
returned by choose2(T, depth) and choose1(T, depth) applied to $x\nu$ node trees
T, we obtain

ff(x, depth) =
    if x < 2 or depth = 0
    then
        2 * log(x)
    else
        $1 + (1/x) \int_{1}^{x-1} gg(y, depth - 1)\, dy + (2/x) \int_{x-1}^{x} ff(y, depth - 1)\, dy$
    fi

where

gg(x, depth) =
    if x < 2 or depth = 0
    then
        2 * log(x)
    else
        $1 + (2/x) \int_{x/2}^{x} gg(y, depth - 1)\, dy$
    fi

and $\lim_{\nu \to \infty} E[f(k\nu)] = ff(k, d)$,
giving an upper bound on $E[|h_1(n) - h_2(n)|]$ as close as required to
$2\,(c\log k - ff(k, d))/pp(k, d)$.

The strongest bound yet found was obtained by taking k = 3.9, d = 5, giving
a bound slightly less than 6.247. $\Box$

(The computation of ff(3.9, 5) and pp(3.9, 5) was done by Maple after
definition of multiple functions such as $ff_{d,i}$, defined only for $\lfloor x \rfloor = i$ and there
equal to ff(x, d); this enables the integrals to be written as sums of integrals so
that no if then else fi constructs remain in the definitions.)

4  The  Convergence  Hypothesis 

The  result  of  the  previous  section  can  be  strengthened  both  by  replacing  the 
‘infinitely  often’  by  ‘almost  always’  and  by  reducing  the  bound,  if  we  accept  a 
very  plausible  and  empirically  justified  hypothesis. 

Definition:  ec(n)  is  the  expected  number  of  critical  nodes  of  an  n-node  tree. 
Note:  the  probability  that  the  addition  of  the  (n  +  l)-st  node  increases  the 
height of the tree is 2ec(n)/(n + 1). Calculation of ec(n) for n up to 100,000
and approximation by constructing random trees for larger n both suggest that
ec(n) is monotonically increasing after initial fluctuations while n is less than 8.
(See  [3]  for  methods  of  rapid  construction  of  very  large  random  trees.) 


The  Convergence  Hypothesis:  ec(n)  tends  to  a  limit  as  n  increases. 
Note:  if  this  hypothesis  is  correct  the  limit  must  be  c/2. 
Theorem 3. If the Convergence Hypothesis holds then $E[|h_1(n) - h_2(n)|] < 4.96$
except for (possibly) finitely many n.

(Again, this implies immediately that the same bound applies to
$E[|h(n) - E[h(n)]|]$.)

Proof.

- extending the result to almost all n:
  In the proof of the main theorem, we relied on the fact that since $E[h(n)]/\log n$
  tends to c, there exist infinitely many $\nu$ for which $E[h(k\nu)] - E[h(\nu)] <
  c\log k + \epsilon$. Now given the convergence hypothesis, for any k and $\epsilon$, by
  choosing N large enough we can guarantee that for all $\nu$ larger than N,
  $ec(\nu) < c/2 + \epsilon/(2\log k)$ so that $E[h(k\nu) - h(\nu)] < c\log k + \epsilon$ and the
  argument of section 3 goes through unchanged.

- reducing the bound:
  The proof in section 3 used the fact that a random tree of size $\alpha\nu$ had height
  at least $E[h(\nu)] + 2\log\alpha$ (and a similar result for the higher of two trees of
  sizes $\beta\nu$ and $\gamma\nu$). Given the Convergence Hypothesis, provided we choose $\nu$
  large enough, the first of these results remains true with any constant less
  than c instead of the "2". (The second does not since the higher of two random
  trees is not a random tree.) Hence we can replace the "$2\log x$" by
  "$(c - \epsilon)\log x$" in the definition of ff (but not gg) and recompute
  the bound obtained by the modified version of inequality (1). This time the
  best result obtained has d = 1 and k = 2.67, giving a bound of just under
  4.953. $\Box$


5  Further  work 

As  has  been  shown,  a  significant  improvement  in  the  main  theorem  would  be 
obtained  if  the  expected  value  of  the  number  of  critical  nodes  was  shown  to 
converge.  Even  showing  that  this  expectation  is  bounded  for  all  n  would  prove 
that $E[|h_1(n) - h_2(n)|]$ is bounded though it would not directly give an explicit
bound.  It  seems  extremely  implausible  that  this  expectation  should  oscillate 
unboundedly  but  a  proof  that  it  does  not  do  so  has  not  been  easy  to  find. 

Alternatively,  improving  the  algorithm  for  choosing  a  subtree  could  further 
decrease  the  numerical  value  of  the  bound.  Two  ways  of  doing  this  seem  worth 
exploring: firstly, when the $k\nu$ node tree turns out to have three or more $\nu$
node subtrees, a careful choice between these should give a deeper leaf than the
current choice between the first two found; secondly, when two subtrees are found
with depths B and C and sizes $\beta\nu$ and $\gamma\nu$, biasing the choice to the one with
larger depth and size must give a deeper leaf on average. Also there may be other
parameters  for  which  the  existing  algorithm  gives  a  better  bound.  Unfortunately 
the  computations  are  very  slow  with  d  >  3  so  not  very  many  have  been  done 
(the  computation  of  the  bound  in  theorem  2  took  Maple  a  weekend). 

The methods and results developed here tell us nothing about the variance
of h(n). We continue to conjecture that this variance is bounded as n goes to
infinity.
Abstract. We present a new master theorem for the study of divide-and-conquer
recursive definitions, which improves the old one in several
aspects. In particular, it provides more information, frees us completely
from technicalities like floors and ceilings, and covers a wider set of toll
functions and weight distributions, stochastic recurrences included.


1  Introduction 

Let $F_n$ denote a variable related to some divide-and-conquer (d.a.c., for short)
algorithm or data structure, such as the number of comparisons made in quicksort
or the number of nodes visited during a search in a BST, when dealing with an
instance of size n. By the recursive structure of the algorithm or data structure
it is always possible to get a recurrence that defines $F_n$ from the values of the
variable for instances of smaller size. From this recurrence it is necessary to
deduce explicit or asymptotic expressions for $F_n$ (that is, we need to "solve" the
recurrence).

To this end, we can make use of the (classical) master theorem (see [1, 5]). See
also [7, 8] for several improved versions. It is a set of simple rules that provide
quick (albeit partial) information on the value of $F_n$. Assume that we have the
recurrence

$$F_n = t_n + W \cdot F_{S_n}, \qquad (1)$$

where $t_n$ is the toll function or cost of the divide and combine steps needed to
solve a problem of size n, W is the (fixed) number of recursive calls at each
step, and $S_n = Z \cdot n + O(1)$ is the size of the subproblems to be recursively
solved, for some $0 < Z < 1$. Notice that expressions with floors and ceilings in
the argument of the recursive call, like $\lfloor n/2 \rfloor$, are covered by the term $O(1)$
above. Let $\alpha = -\log_Z W$. Then, the classical master theorem states that the
solution to this recurrence is

$$F_n = \begin{cases}
\Theta(n^{\alpha}), & \text{if } t_n = O(n^{a}) \text{ for } a < \alpha; \\
\Theta(t_n \log n), & \text{if } t_n = \Theta(n^{\alpha} \log^{c} n) \text{ for } c \ge 0; \\
\Theta(t_n), & \text{if } t_n = \Omega(n^{a}) \text{ for } a > \alpha.
\end{cases} \qquad (2)$$


*  This  research  was  supported  by  the  ESPRIT  LTR  Project  ALCOM-IT,  contract 
20244  and  by  a  grant  from  CIRIT  (Comissio  Interdepartamental  de  Recerca  i 
Innovacio  Tecnologica) . 




Notice that there are two gaps for the values of $t_n$ where we cannot use the
master theorem.

Although this theorem is sometimes enough for simple purposes, it presents
some drawbacks. For instance, consider the recurrence

$$B_n = 1 + \frac{\lfloor (n-1)/2 \rfloor}{n}\, B_{\lfloor (n-1)/2 \rfloor} + \frac{\lceil (n-1)/2 \rceil}{n}\, B_{\lceil (n-1)/2 \rceil} \quad \text{if } n \ge 2, \qquad (3)$$

with $B_0 = 0$ and $B_1 = 1$, defining the expected number of comparisons during
a binary search in an array of size n, when we search for some key in the array
chosen at random. It does not follow the master theorem pattern exactly, since we
do not have exactly one expected recursive call at each step but $1 - 1/n$, since
the central item in the current search range could be, by chance, the sought
item. In other words, the number of recursive calls is not constant but tends
to a constant. Despite this, we can assume that the solution to the recurrence
$F_n = 1 + F_{n/2}$ must be close to $B_n$ (which is true) and therefore deduce that
$B_n = \Theta(\log n)$. Further reasoning can rigorously prove that this approximation
does not lead to a wrong answer.
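Recurrence (3) is easy to evaluate directly, which gives a quick check of the claim that $B_n$ grows like $\log_2 n$. The sketch below is an illustration under the reading of (3) with weights $\lfloor (n-1)/2 \rfloor / n$ and $\lceil (n-1)/2 \rceil / n$; the function name is hypothetical.

```python
import math
from functools import lru_cache

@lru_cache(maxsize=None)
def expected_comparisons(n):
    """B_n from recurrence (3), with B_0 = 0 and B_1 = 1."""
    if n == 0:
        return 0.0
    if n == 1:
        return 1.0
    low = (n - 1) // 2   # floor((n - 1) / 2)
    high = n - 1 - low   # ceil((n - 1) / 2)
    return (1
            + (low / n) * expected_comparisons(low)
            + (high / n) * expected_comparisons(high))

for n in (15, 1023, 16383):
    print(n, expected_comparisons(n), math.log2(n))
```

The ratio of the two printed columns approaches 1 slowly, because $B_n$ differs from $\log_2 n$ by a bounded additive term.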

The analysis of stochastic recurrences is much more difficult. An example is

$$S_0 = 0, \qquad S_n = n - 1 + \frac{2}{n^2} \sum_{0 \le k < n} k\, S_k \quad \text{if } n \ge 1, \qquad (4)$$

defining the expected number of comparisons to select the i-th of the n keys of
an array (where i is chosen at random) when using Hoare's FIND [3]. Here we
would need to make further approximations, which could easily lead to wrong
conclusions.

The theorems presented in this paper improve previous theorems in several
aspects. On the one hand, we will show how technicalities like floors and ceilings
no longer need to be treated, neither before the analysis of the
recurrence nor afterwards. On the other, recurrences where the asymptotic sizes
of the subproblems to be recursively solved consist of a set of several fixed
fractions of the original problem (this improvement was already considered in [8])
and the number of recursive calls to each one tends to a constant (but is not
constant) like

Fn  ==  tn  F  (2  -  l/y/n)Fln/3j  +  +  (1  +  l/?^)F’|-4n/5+ln2  n]  (5) 

for n large enough, can be easily analysed through our theorems as well. Depending
on $t_n$, we can also deduce the constant of the main term of the solution (for
the basic recurrence (1) this constant can also be found; see [5], for instance).
Furthermore, we will be able to deal with stochastic recurrences like (4), and a
simple application of our theorems will sometimes yield several of the main terms
of the explicit solution of a recurrence with their corresponding multiplicative
factors, and not only the order of growth of the dominating term. Finally, we will
see how the new theorems cover a wider set of toll functions. In terms of the
classical master theorem, the new ranges for $t_n$ are $t_n = O(n^{\alpha} \log^{c} n)$ for $c < -1$,
$t_n = \Theta(n^{\alpha} \log^{c} n)$ for $c > -1$ and $t_n = \Omega(n^{a})$ for $a > \alpha$, thus closing almost
completely the gap between the first and the second case. The results for the
first range were first given in [7].

The following sections are organized as follows. In Sections 2 and 3 we present the
two types of recurrences that are most likely to appear in practical situations,
and  give  a  master  theorem  for  each  one.  Section  4  includes  the  main  results  from 
which  both  theorems  can  be  derived.  Section  5  ends  the  paper  with  some  final 
remarks. 


2  The  Discrete  Master  Theorem 

To begin with, let us introduce the concept of divide-and-conquer recursive
definition formally.

Definition 1. Let $F_n \ge 0$ be a function defined for all $n \ge 0$. We say that

$$\mathcal{F} = \langle \{b_n\}_{0 \le n < N},\ \{t_n\}_{n \ge N},\ \{w_{n,k}\}_{n \ge N,\, 0 \le k < n} \rangle$$

is a d.a.c. recursive definition of $F_n$ iff $N \ge 1$, $F_n = b_n$ for all $0 \le n < N$ and

$$F_n = t_n + \sum_{0 \le k < n} w_{n,k}\, F_k \qquad (6)$$

for every $n \ge N$, where $t_n \ge 0$ and $w_{n,k} \ge 0$.

The weight $w_{n,k}$ is the (expected) number of recursive calls to the algorithm to
deal with a subproblem of size k when the original problem has size n, while
$t_n$ includes the cost to divide a problem of size n into smaller subproblems that
will be recursively solved, and to combine the solutions of the recursive calls to
find the answer to the whole problem.

Definition 2. Let $\mathcal{F}$ be a d.a.c. recursive definition of a function $F_n$. We say
that $\mathcal{F}$ is a discrete recursive definition if it follows the pattern

$$F_n = t_n + \sum_{1 \le d \le D} R_{d,n}\, F_{S_{d,n}} \qquad (7)$$

for every $n \ge N$, where $D \ge 1$ is the (finite) number of subproblems to be
recursively solved; $R_{d,n} = w_d + r_{d,n} \ge 0$ is the number of recursive calls to deal
with the d-th subproblem, where $w_d > 0$ is the asymptotic number of calls to it
and

$$\sum_{1 \le d \le D} |r_{d,n}| = O(n^{-\rho}) \qquad (8)$$

for some $\rho > 0$; $S_{d,n} = z_d \cdot n + s_{d,n}$ is the (integer) size of the d-th subproblem
to be recursively solved, where $0 < z_d < 1$ and

$$\sum_{1 \le d \le D} \left| \frac{s_{d,n}}{n} \right| = O(n^{-\sigma}) \qquad (9)$$

for some $\sigma > 0$.
For example, (3) is a discrete recursive definition. Here we have two subproblems
to recursively deal with (D = 2) that are both asymptotically 1/2 the size
of the whole problem ($z_1 = z_2 = 1/2$, $-3/2 \le s_{1,n} \le -1/2$, $-1/2 \le s_{2,n} \le 1/2$)
and 1/2 expected calls to each one ($w_1 = w_2 = 1/2$, $-3/(2n) \le r_{1,n} \le -1/(2n)$,
$-1/(2n) \le r_{2,n} \le 1/(2n)$), where for the bounds of $s_{d,n}$ and $r_{d,n}$ we have used the
fact that $r - 1 < \lfloor r \rfloor \le r$ and $r \le \lceil r \rceil < r + 1$ for every real r. Notice that
$\rho = \sigma = 1$ is a possible choice here.

Theorem 3 (Discrete Master Theorem). Let $\mathcal{F}$ be a discrete recursive
definition of a function $F_n$, and let $B n^{a} \ln^{c} n$ be the main term of $t_n$, for some
constants B, a and c. Let us define

$$\Phi(x) = \sum_{1 \le d \le D} w_d\, z_d^{x},$$

and let $\mathcal{H} = 1 - \Phi(a)$. Then,

1) if $\mathcal{H} > 0$ then $F_n \sim t_n/\mathcal{H}$;

2) if $\mathcal{H} = 0$ then

   2.1) if $c > -1$ then $F_n \sim t_n \ln n/\mathcal{H}'$, where
        $\mathcal{H}' = -(c+1) \sum_{1 \le d \le D} w_d\, z_d^{a} \ln z_d$;

   2.2) if $c = -1$ then $F_n = O(n^{a} \log^{\epsilon} n)$ for any $\epsilon > 0$;

   2.3) if $c < -1$ then $F_n = O(n^{a})$;

3) if $\mathcal{H} < 0$ then $F_n = \Theta(n^{\alpha})$, where $\alpha$ is the unique solution of $\Phi(\alpha) = 1$.


Some  Examples  of  the  Use  of  the  Discrete  Master  Theorem 

Let us solve (3). To begin with, the main term in its toll function is $n^0 \log^0 n$.

Now we can use the master theorem as follows.

1. First, we identify the sets of values $\{w_d\}_{1 \le d \le D}$ and $\{z_d\}_{1 \le d \le D}$. This yields
   $w_1 = w_2 = 1/2$ and $z_1 = z_2 = 1/2$. We should make sure that properties (8)
   and (9) hold, but this is trivial here (floors and ceilings are never a problem).

2. We define $\Phi(x) = (1/2)^x$ and hence $\mathcal{H} = 1 - \Phi(0) = 0$.

3. Since $\mathcal{H} = 0$ and $c > -1$, we define $\mathcal{H}' = -(0 + 1)((1/2)^0 \ln(1/2)) =
   -\ln(1/2) = \ln 2$, and finally $B_n \sim \ln n/\ln 2 = \log_2 n$.

Let  us  consider  now  (5).  Assume  that  tn  =  6n^/ln^n. 

1. $w_1 = 2$, $w_2 = 4$ and $w_3 = 1$; $z_1 = 1/3$, $z_2 = 1/2$ and $z_3 = 4/5$. (It is a simple
   matter to check that this recurrence is indeed a discrete recursive definition.)

2. $\Phi(x) = 2(1/3)^x + 4(1/2)^x + (4/5)^x$ and hence $\mathcal{H} = 1 - \Phi(2) = -194/225$.

3. Since $\mathcal{H} < 0$, $F_n = \Theta(n^{\alpha})$, where $\alpha$ is the unique solution of $\Phi(\alpha) = 1$, which
   numerically is $\alpha \approx 3.16756$.
Finally, let us set $t_n = n^2$ for the recurrence

$$F_n = t_n + F_{\lfloor n/4 \rfloor}. \qquad (10)$$

Notice that we do not need to explicitly state the values of $F_n$ at small indices,
since they are irrelevant to the master theorem. Solving it is very easy.

1. $w_1 = 1$ and $z_1 = 1/4$.

2. $\Phi(x) = (1/4)^x$ and hence $\mathcal{H} = 1 - \Phi(2) = 15/16$.

3. Since $\mathcal{H} > 0$, $F_n \sim n^2/(15/16) = 16n^2/15$.

Note that the last example above follows the pattern of Equation 1, and
hence we can analyse it through the old master theorem (2), which becomes a
particular case of the new one.
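The three worked examples follow the same mechanical steps, so they can be reproduced with a small helper. The sketch below is an illustration of the case analysis of the discrete master theorem, not code from the paper: the function names are hypothetical, the exponent is found by bisection (valid because $\Phi$ is strictly decreasing when every $0 < z_d < 1$), and the value of c passed for example (5) is a placeholder since c plays no role when $\mathcal{H} < 0$.

```python
import math

def phi(weights, fractions, x):
    """Phi(x) = sum of w_d * z_d ** x."""
    return sum(w * z ** x for w, z in zip(weights, fractions))

def solve_alpha(weights, fractions, lo=-50.0, hi=50.0, tol=1e-12):
    """Unique root of Phi(alpha) = 1, by bisection on a decreasing function."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if phi(weights, fractions, mid) > 1:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def discrete_master(weights, fractions, a, c, eps=1e-12):
    """Return (description of the case, relevant constant) for toll ~ B n^a ln^c n."""
    h = 1 - phi(weights, fractions, a)
    if h > eps:
        return ("F_n ~ t_n / H", h)
    if h < -eps:
        return ("F_n = Theta(n^alpha)", solve_alpha(weights, fractions))
    if c > -1:
        h_prime = -(c + 1) * sum(w * z ** a * math.log(z)
                                 for w, z in zip(weights, fractions))
        return ("F_n ~ t_n ln n / H'", h_prime)
    if c == -1:
        return ("F_n = O(n^a log^eps n)", None)
    return ("F_n = O(n^a)", None)

print(discrete_master([0.5, 0.5], [0.5, 0.5], a=0, c=0))        # recurrence (3)
print(discrete_master([2, 4, 1], [1/3, 1/2, 4/5], a=2, c=0.0))  # recurrence (5); c unused here
print(discrete_master([1], [1/4], a=2, c=0))                    # recurrence (10)
```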


3  The  Continuous  Master  Theorem 


This  section  covers  the  analysis  of  recursive  definitions  like  (4).  To  begin  with, 
let  us  define  the  concept  of  shape  function  (the  reason  for  this  name  will  be  clear 
after  Definition  5). 

Definition 4. Let $\omega(z) \ge 0$ be a function over [0, 1] such that $\omega'(z)$ exists and
is bounded for every $0 < z < 1$. Furthermore, let $\int_0^1 \omega(z)\,dz$ be greater than or equal
to 1. Then we say that $\omega(z)$ is a shape function.


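Definition 5. Let $\mathcal{F}$ be a d.a.c. recursive definition of a function $F_n$. We say
that $\mathcal{F}$ is a continuous recursive definition if it follows the pattern

$$F_n = t_n + \sum_{0 \le k < n} \omega_{n,k}\, F_k \qquad (11)$$

for every $n \ge N$, and if there exists some shape function $\omega(z)$ such that

$$\sum_{0 \le k < n} \left| \omega_{n,k} - \int_{k/n}^{(k+1)/n} \omega(z)\, dz \right| = O(n^{-\rho}) \qquad (12)$$

for some $\rho > 0$.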


Loosely speaking, the last definition allows us to use the integral on the right
of the expression above to find a good approximation to $\omega_{n,k}$. For instance,
the shape function for (4) is $\omega(z) = 2z$ (notice that this function satisfies the
conditions required for a shape function), since

$$\int_{k/n}^{(k+1)/n} 2z\, dz = \Big[ z^2 \Big]_{k/n}^{(k+1)/n} = \frac{2k}{n^2} + \frac{1}{n^2} = \omega_{n,k} + \frac{1}{n^2},$$

and hence the sum of errors is $1/n = O(n^{-\rho})$ for $\rho = 1$.
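Condition (12) can be checked exactly for this example with rational arithmetic. The sketch below takes the weights of (4) to be $2k/n^2$ and the shape function to be $2z$, and confirms that the total error is exactly $1/n$; the function name is illustrative.

```python
from fractions import Fraction

def shape_error_sum(n):
    """Sum over k of |w_{n,k} - integral of 2z over [k/n, (k+1)/n]|, computed exactly."""
    total = Fraction(0)
    for k in range(n):
        weight = Fraction(2 * k, n * n)                    # weight of S_k in recurrence (4)
        integral = Fraction((k + 1) ** 2 - k ** 2, n * n)  # z^2 evaluated between the limits
        total += abs(weight - integral)
    return total

print(shape_error_sum(10))  # exactly 1/10
```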
Therefore, $\omega(z)$ is nothing but the asymptotic shape of the distribution of
weights, which now does not consist of a finite number of fixed fractions of the
original size of the problem (as it did in the previous section), but is very similar to
a continuous probability distribution, where the area beneath the function is the
asymptotic number of recursive calls. Recall that, by definition, $\int_0^1 \omega(z)\,dz \ge 1$,
and therefore we are assuming that there is at least one asymptotic recursive
call. This condition (very likely to hold in practice) simplifies the study of these
recurrences.

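Theorem 6 (Continuous Master Theorem). Let $\mathcal{F}$ be a continuous recursive
definition of a function $F_n$, and let $B n^{a} \ln^{c} n$ be the main term of $t_n$, where
B, a and c are constants. Let us define

$$\varphi(x) = \int_0^1 \omega(z)\, z^{x}\, dz,$$

and let $\mathcal{H} = 1 - \varphi(a)$. Then,

1) if $\mathcal{H} > 0$ then $F_n \sim t_n/\mathcal{H}$;

2) if $\mathcal{H} = 0$ then

   2.1) if $c > -1$ then $F_n \sim t_n \ln n/\mathcal{H}'$, where
        $\mathcal{H}' = -(c+1) \int_0^1 \omega(z)\, z^{a} \ln z\, dz$;

   2.2) if $c = -1$ then $F_n = O(n^{a} \log^{\epsilon} n)$ for any $\epsilon > 0$;

   2.3) if $c < -1$ then $F_n = O(n^{a})$;

3) if $\mathcal{H} < 0$ (including the case $\mathcal{H} = -\infty$) then $F_n = \Theta(n^{\alpha})$, where $\alpha$ is the
   unique solution of $\varphi(\alpha) = 1$.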


Some  Examples  of  the  Use  of  the  Continuous  Master  Theorem 

Let us solve the recursive definition

$$Q_0 = 0, \qquad Q_n = 1 + \frac{4}{n(n+1)} \sum_{0 \le k < n} (n - k)\, Q_k \qquad (13)$$

related to the number of comparisons in a half-defined search in a quad-tree [2].
Notice that the main term in the toll function is $n^0$. Hence,

1. First, we identify the shape function of the weights. As a first attempt, we can
   try the following.

   (a) From $\omega_{n,k}$ we compute a set of new weights $\sigma_{n,k}$, by replacing terms like
       $n + 1$ or $(n - 1 - k)$ by n or $(n - k)$, respectively. This yields $\sigma_{n,k} =
       4(n - k)/n^2$.

   (b) Now we have to check that $\sum_{0 \le k < n} |\omega_{n,k} - \sigma_{n,k}| = O(n^{-\rho})$. This is true in our
       example, since $|\omega_{n,k} - \sigma_{n,k}| = 4(n - k)/(n^2(n + 1)) = O(n^{-2})$.

   (c) We compute $\omega(z) = n \cdot \sigma_{n,zn}$. This step produces an expression without
       n's, $\omega(z) = n \cdot 4(n - zn)/n^2 = 4(1 - z)$.

   (d) Finally, we should prove that $\omega'(z)$ exists and is bounded, and also that
       $\int_0^1 \omega(z)\,dz \ge 1$, which is trivial here.

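2. We define

   $$\varphi(x) = 4 \int_0^1 (1 - z)\, z^{x}\, dz = 4 \left( \frac{1}{x+1} - \frac{1}{x+2} \right) = \frac{4}{(x+1)(x+2)}$$

   if $x > -1$ (and $\varphi(x) = +\infty$ otherwise). Hence, $\mathcal{H} = 1 - \varphi(0) = -1$.

3. Since $\mathcal{H} < 0$ we have that $Q_n = \Theta(n^{\alpha})$, where $\alpha$ is defined as the unique
   solution of $\varphi(\alpha) = 1$, which yields $\alpha = (\sqrt{17} - 3)/2 \approx 0.56155$ (this is the
   only solution to $\alpha^2 + 3\alpha - 2 = 0$ that is greater than $-1$).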
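The exponent can be checked against a direct evaluation of the recurrence. The sketch below assumes that (13) reads $Q_n = 1 + \frac{4}{n(n+1)} \sum_{0 \le k < n} (n-k) Q_k$, which is the form consistent with the approximate weights $4(n-k)/n^2$ used in step 1; it keeps two running sums so that the whole table costs linear time, and estimates the growth exponent from a doubling of n.

```python
import math

def quadtree_values(n_max):
    """Table of Q_0..Q_{n_max} for the assumed form of recurrence (13)."""
    q = [0.0] * (n_max + 1)
    sum_q = 0.0   # sum of Q_k for k < n
    sum_kq = 0.0  # sum of k * Q_k for k < n
    for n in range(1, n_max + 1):
        # sum over k < n of (n - k) * Q_k equals n * sum_q - sum_kq
        q[n] = 1 + 4 * (n * sum_q - sum_kq) / (n * (n + 1))
        sum_q += q[n]
        sum_kq += n * q[n]
    return q

q = quadtree_values(40000)
slope = math.log(q[40000] / q[20000]) / math.log(2)
alpha = (math.sqrt(17) - 3) / 2
print(f"empirical exponent {slope:.4f}, predicted {alpha:.5f}")
```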


Let us analyse (4).

1. We already know that $\omega(z) = 2z$.

2. We define

   $$\varphi(x) = 2 \int_0^1 z^{x+1}\, dz = \frac{2}{x+2}$$

   if $x > -2$ (and $\varphi(x) = +\infty$ otherwise). Hence, $\mathcal{H} = 1 - \varphi(1) = 1/3$.

3. Since $\mathcal{H} > 0$, $S_n \sim n/(1/3) = 3n$.


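In this example we can get even more information, by means of a simple trick.
Define $G_n = S_n - 3n$. Then

$$G_n = n - 1 + \frac{2}{n^2} \sum_{0 \le k < n} k\, S_k - 3n = -2n - 1 + \frac{6}{n^2} \sum_{0 \le k < n} k^2 + \frac{2}{n^2} \sum_{0 \le k < n} k\, G_k.$$

It is well known that $\sum_{0 \le k < n} k^2 = n^3/3 - n^2/2 + n/6$. Therefore,

$$G_n = -4 + \frac{1}{n} + \frac{2}{n^2} \sum_{0 \le k < n} k\, G_k.$$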

We can now solve this recurrence using Theorem 6 again. The first step is already
done, since the distribution of weights remains the same, as is the case for $\varphi(x)$.
Computing $\mathcal{H}$ produces $\mathcal{H} = 1 - \varphi(0) = 0$. Since $\mathcal{H} = 0$ and $c > -1$, we define

$$\mathcal{H}' = -(0 + 1) \int_0^1 2z\, z^0 \ln z\, dz = -2 \int_0^1 z \ln z\, dz = -2 \left[ \frac{z^2}{2} \ln z - \frac{z^2}{4} \right]_0^1 = \frac{1}{2},$$

and get $G_n \sim -4 \ln n/(1/2) = -8 \ln n$. We can make one more step, defining
$I_n = G_n + 8 \ln n$ for $n > 0$, which produces

$$I_n = \frac{8 \ln n}{n} + \frac{1}{n} - \frac{4 \ln n}{3n^2} + O(n^{-2}) + \frac{2}{n^2} \sum_{0 < k < n} k\, I_k,$$

where we have used the equality
$\sum_{0 < k < n} k \ln k = \frac{n^2}{2} \ln n - \frac{n^2}{4} - \frac{n}{2} \ln n + \frac{\ln n}{12} + O(1)$.

Now we get $\mathcal{H} = 1 - \varphi(-1) = -1 < 0$, and hence $I_n = O(n^{\alpha}) = O(1)$. Notice
that we cannot deduce that $I_n = \Theta(1)$, since the toll function in the definition of
$I_n$ includes positive and negative terms together, and their contributions could
cancel each other. As a final conclusion we have that $S_n = 3n - 8 \ln n + O(1)$.
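The conclusion $S_n = 3n - 8\ln n + O(1)$ can be checked by evaluating the recurrence directly. The sketch below takes recurrence (4) to be $S_n = n - 1 + \frac{2}{n^2}\sum_{0 \le k < n} k S_k$ with $S_0 = 0$, and prints the remainder $S_n - 3n + 8\ln n$, which should stay bounded as n grows; the function name is illustrative.

```python
import math

def find_costs(n_max):
    """Table of S_0..S_{n_max} for recurrence (4)."""
    s = [0.0] * (n_max + 1)
    weighted = 0.0  # running value of the sum of k * S_k for k < n
    for n in range(1, n_max + 1):
        s[n] = n - 1 + 2 * weighted / (n * n)
        weighted += n * s[n]
    return s

s = find_costs(2000)

def remainder(n):
    """S_n - 3n + 8 ln n, expected to be O(1)."""
    return s[n] - 3 * n + 8 * math.log(n)

for n in (250, 500, 1000, 2000):
    print(n, round(s[n] / n, 4), round(remainder(n), 4))
```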

We end this section analysing the number of comparisons while sorting an
array of n keys through the variant of quicksort that uses as the pivot of the
partition stage the median of a random sample of $2k + 1$ keys, for some fixed
$k \ge 0$ ($k = 0$ reduces to basic quicksort). This method was suggested by Hoare
himself in [4], and later Van Emden [6] analysed it by means of information-theoretic
arguments and sensible approximations. We can now prove the same
results as a simple consequence of the continuous master theorem.

Let $Q_n^{(k)}$ be the expected number of comparisons while using quicksort to sort
an array with n items when the sample has $2k + 1$ keys at each stage (except
for small n). The recurrence for any k is


$$Q_n^{(k)} = n - 1 + S_k + 2 \sum_{0 \le j < n} \frac{\binom{j}{k} \binom{n-1-j}{k}}{\binom{n}{2k+1}}\, Q_j^{(k)}, \qquad (14)$$

where $S_k$ is the (linear in k, but constant in n) number of comparisons to find
the median of the sample. Therefore,


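1. (a) We compute

       $$\sigma_{n,j} = \frac{2\,(2k+1)!}{n^{2k+1} \cdot k! \cdot k!}\; j^{k} (n - j)^{k}$$

       as a good approximation for $\omega_{n,j}$.

   (b) It is routine work to check that $\sum_{0 \le j < n} |\omega_{n,j} - \sigma_{n,j}| = O(n^{-\rho})$ for some $\rho > 0$.

   (c) We compute $\omega_k(z) = n \cdot \sigma_{n,zn} = \frac{2\,(2k+1)!}{k!^2} \cdot z^{k} (1 - z)^{k}$.

   (d) Since $\omega_k(z)$ is a polynomial in z, we have that $\omega_k'(z)$ exists and is
       bounded. Furthermore, we know that the asymptotic number of recursive
       calls is 2.

2. We define

   $$\varphi_k(x) = \int_0^1 \omega_k(z)\, z^{x}\, dz = \frac{2\,(2k+1)!}{k!^2} \int_0^1 z^{x+k} (1 - z)^{k}\, dz,$$

   and evaluate $\mathcal{H}(k) = 1 - \varphi_k(1)$. For this step, we can use the equality

   $$\int_0^1 z^{\alpha} (1 - z)^{\beta}\, dz = \frac{\alpha!\, \beta!}{(\alpha + \beta + 1)!}$$

   (see [2], page 479, for instance) to find that, as expected, $\mathcal{H}(k) = 0$.

3. Therefore, we define

   $$\mathcal{H}'(k) = -\int_0^1 \omega_k(z)\, z \ln z\, dz = -\frac{2\,(2k+1)!}{k!^2} \int_0^1 z^{k+1} (1 - z)^{k} \ln z\, dz.$$

   This step yields $\mathcal{H}'(k) = 1/(k + 2) + 1/(k + 3) + \ldots + 1/(2k + 2)$ (see [6],
   page 565). Finally, $Q_n^{(k)} \sim n \ln n/\mathcal{H}'(k)$.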
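The leading constant $1/\mathcal{H}'(k)$ is a finite harmonic sum, so it is easy to tabulate exactly. The sketch below is only an illustration of the closed form above with hypothetical function names; for k = 0 it gives the familiar $2n\ln n$ of basic quicksort and for k = 1 (median of three) it gives $(12/7)\,n\ln n$.

```python
from fractions import Fraction

def h_prime(k):
    """H'(k) = 1/(k+2) + 1/(k+3) + ... + 1/(2k+2), as an exact fraction."""
    return sum(Fraction(1, j) for j in range(k + 2, 2 * k + 3))

def leading_constant(k):
    """Coefficient of n ln n in the expected number of comparisons."""
    return 1 / h_prime(k)

for k in range(4):
    print(k, leading_constant(k), float(leading_constant(k)))
```

As k grows the sum tends to ln 2, so the constant tends to 1/ln 2, which matches the information-theoretic lower bound of $n\log_2 n$ comparisons.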
4 The Theorems


In  this  section  we  present,  without  proof,  the  main  technical  results  from  which 
both  Theorem  3  and  Theorem  6  can  be  derived.  Most  of  them  refer  to  canonical 
recursive  definitions,  which  are  defined  as  follows. 


Definition 7. Let $\mathcal{F}$ be a d.a.c. recursive definition of a function $F_n$. Let $W_n =
\sum_{0 \le k < n} w_{n,k}$. We say that $\mathcal{F}$ is a canonical recursive definition if and only if
both these properties hold: 1) There exists some $\rho > 0$ such that $|W_n - 1| = O(n^{-\rho})$.
2) There exists some upper bound $U < 1$ such that

$$\sum_{0 \le k < n} w_{n,k}\, \frac{k}{n} \le U$$

for n large enough.

Intuitively,  the  first  condition  requires  that  the  total  number  of  recursive  calls 
to  solve  a  problem  of  size  n  tends  to  1  (with  a  minimum  convergence  speed). 
Notice  that,  opposed  to  the  old  master  theorem,  the  number  of  recursive  calls 
depends  on  n.  The  sum  in  the  second  condition  above  is  the  average  fraction  of 
the  original  problem  that  is  solved  by  a  recursive  call.  Therefore,  this  condition 
implies  that  the  problem  is  broken  into  pieces  that  are  (on  average)  a  fraction 
of  the  original  one. 
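The two conditions can be evaluated directly for concrete weights. The sketch below assumes the second condition reads $\sum_{0 \le k < n} \omega_{n,k}\, k/n \le U$, as the intuitive description above suggests. Both weight families are illustrative choices, not taken from the paper: `uniform` uses $\omega_{n,k} = 1/n$, and `normalized_quicksort` uses $\omega_{n,k} = 2k/n^2$, which is what the quicksort weights $2/n$ become after substituting $B_n = F_n/n$.

```python
from fractions import Fraction


def total_weight(weights, n):
    """W_n: sum of omega_{n,k} over 0 <= k < n."""
    return sum(weights(n, k) for k in range(n))


def average_fraction(weights, n):
    """Sum of omega_{n,k} * k/n over 0 <= k < n (the quantity bounded by U)."""
    return sum(weights(n, k) * Fraction(k, n) for k in range(n))


def uniform(n, k):
    """Hypothetical weights omega_{n,k} = 1/n."""
    return Fraction(1, n)


def normalized_quicksort(n, k):
    """Hypothetical weights omega_{n,k} = 2k/n^2."""
    return Fraction(2 * k, n * n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        for name, w in (("uniform", uniform),
                        ("normalized quicksort", normalized_quicksort)):
            print(n, name, float(total_weight(w, n)), float(average_fraction(w, n)))
```

In both cases $W_n$ stays within $1/n$ of 1 and the average fraction stays below a constant smaller than 1 (it tends to $1/2$ and $2/3$ respectively), so both families pass the two tests for every $n$ tried.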

Let $\mathcal{F}$ be a recursive definition of $F_n$. As an immediate consequence of (6) we
have that $F_n = \Omega(t_n)$. The natural question that arises is: can $F_n$ grow faster
than $t_n$ and, if so, under which conditions? Roughly speaking, for the recursive
definitions we deal with there is a growth order associated to every distribution
of weights, irrespective of how small $t_n$ is. Let us call it
$\Theta(n^\alpha)$. Then the growth order of $F_n$ should be $\max\{\Theta(t_n), \Theta(n^\alpha)\}$. And this
is almost true.

For instance, let us consider (10). We will see in a moment that $\alpha = 0$ for any
canonical recurrence, such as this one. Therefore, for “big” values of $t_n$, such as
$n$ or $2^n$, we should get $F_n = \max\{\Theta(t_n), \Theta(1)\} = \Theta(t_n)$, which is true. For
“small” values of $t_n$ like $1/n$, $1/n^2$ or $2^{-n}$, $F_n$ should be $\Theta(1)$, which is also true.
However, things are not so easy for values of $t_n$ close to $\Theta(1)$. For example, for
$t_n = 1$, $F_n$ turns out to be $\Theta(\log n)$ instead of $\Theta(1)$. We will see in this section
how to cope with this additional factor.
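The three regimes can be observed numerically. The sketch below does not use recurrence (10) of the paper. It assumes a hypothetical canonical recurrence $F_n = t_n + \frac{1}{n}\sum_{0 \le k < n} F_k$ with $F_0 = t_0$, chosen only because its weights $1/n$ sum to exactly 1. The function name `solve` and the three toll functions are illustrative.

```python
from math import log


def solve(toll, n_max):
    """Return [F_0, ..., F_{n_max}] for F_n = t_n + (1/n) * sum_{k<n} F_k, F_0 = t_0."""
    values = [toll(0)]
    prefix = values[0]  # running sum F_0 + ... + F_{n-1}
    for n in range(1, n_max + 1):
        f_n = toll(n) + prefix / n
        values.append(f_n)
        prefix += f_n
    return values


N = 100_000
constant = solve(lambda n: 1.0, N)             # toll t_n = 1
linear = solve(lambda n: float(n), N)          # "big" toll t_n = n
small = solve(lambda n: 1.0 / (n + 1) ** 2, N)  # "small" toll

print(constant[N] / log(N))  # ratio to ln n stays bounded: logarithmic growth
print(linear[N] / N)         # ratio to t_n stays bounded: F_n is of order t_n
print(small[N])              # F_n itself stays bounded
```

For this particular recurrence one can verify by hand that $t_n = 1$ gives $F_n = 1 + H_n$, with $H_n$ the $n$-th harmonic number, which is the extra logarithmic factor mentioned above.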

There is another remark about the results in this section that we should make,
namely that recursive definitions with a “small” toll function (and thus inside the
zone dominated by the term $\Theta(n^\alpha)$ associated to the distribution of weights) are
the most difficult to analyse. Indeed, there is no way to find the lower order terms
in the asymptotic expression of $F_n$, nor even the multiplicative factor of the main
term $n^\alpha$, other than to consider all the values of $F_n$, the values at small indices included.
In terms of a recursion tree (see [1], for example) this situation corresponds to
the case in which the solution to the recurrence is dominated by the values at
the leaves. Therefore, a method for the study of recursive definitions based only
on the asymptotic properties of the toll function and the distribution of weights
(like our master theorems) cannot be used to get the multiplicative factor of the
main term in $F_n$. Moreover, for some recursive definitions that factor is not
asymptotically constant.

The next theorems formalize one of the claims stated above, namely that any
canonical recursive definition is, on the one hand, $\Omega(1)$ (under some minimum
additional conditions), and on the other, $O(1)$ for “small” $t_n$.

Theorem 8. Let $\mathcal{F}$ be a canonical recursive definition of a function $F_n$, such
that $t_n > 0$ and $\sum_{0 \le k < N} \omega_{n,k} = O(n^{-\rho})$ for some $\rho > 0$. Then $F_n = \Omega(1)$.

Notice that, apart from being canonical, $\mathcal{F}$ must satisfy two additional
conditions. They are mainly technical properties that hold in most cases. Roughly
speaking, $t_n > 0$ avoids the case “everything is zero”, whilst the second condition
makes the values of $F_n$ at $n < N$ completely irrelevant.

Theorem 9. Let $\mathcal{F}$ be a canonical recursive definition of a function $F_n$, such
that $t_n = O(\log^c n)$ for some $c < -1$. Then $F_n = O(1)$.

Now  we  present  the  main  results  related  to  the  canonical  recursive  definitions 
that  are  dominated  by  the  toll  function.  In  contrast  to  the  last  recurrences, 
where  the  toll  function  lay  in  the  influence  zone  of  the  distribution  of  weights, 
recursive  definitions  whose  toll  function  is  big  enough  to  dominate  the  recurrence 
are  typically  easier  to  analyse,  and  in  most  cases  we  can  get  the  multiplicative 
factor of the main term of the asymptotic expression of $F_n$. Moreover, as we have
already  seen  in  Section  3,  sometimes  it  is  possible  to  get  several  of  the  main 
terms  of  the  solution,  their  multiplicative  factors  included. 


Theorem 10. Let $\mathcal{F}$ be a canonical recursive definition of a function $F_n$, and
let $t_n = n^a \delta_n$, where $a > 0$ and $\delta_n$ is a strictly positive increasing (eventually
constant) function for $n$ large enough. Then $F_n = \Theta(t_n)$.

Theorem 11. Let $\mathcal{F}$ be a canonical recursive definition of a function $F_n$, and let
$t_n = \ln^c n \cdot \delta_n$, where $c > -1$ and $\delta_n$ is a strictly positive increasing (eventually
constant) function for $n$ large enough. Then $F_n = \Theta(t_n \log n)$.

Notice that, according to the old master theorem, $\Theta(1)$ seemed to be the threshold
value for $t_n$ above which $F_n$ became $\omega(1)$. Combining the last theorem with
Theorem 9 allows us to state that, in fact, this threshold lies close to $t_n = 1/\log n$.
The remaining theorems in this section are crucial to both master theorems.


Theorem 12. Let $\mathcal{F}$ be a canonical recursive definition of a function $F_n$, and
let $t_n = n^a \delta_n$, where $a > 0$ and $\delta_n$ is a strictly positive increasing function for $n$
large enough. Furthermore, let

$$\mathcal{H} = \lim_{n \to \infty} \Bigl( 1 - \sum_{M \le k < n} \omega_{n,k} \, \frac{t_k}{t_n} \Bigr)$$

exist for some $M$. Then $F_n = t_n/\mathcal{H} + o(t_n)$.
Theorem 13. Let $\mathcal{F}$ be a canonical recursive definition of a function $F_n$, and
let $t_n = \ln^c n \cdot \delta_n$, where $c > -1$ and $\delta_n$ is a strictly positive increasing function
for $n$ large enough. Furthermore, let


exist for some $M$. Then $F_n = t_n \ln n / \mathcal{H} + o(t_n \log n)$.

Theorem 14. Let $\mathcal{F}$ be a discrete (continuous) recursive definition of a function
$F_n$, and let $\alpha$ be the unique solution of the equation $\Phi(\alpha) = 1$ ($\varphi(\alpha) = 1$). Let $\mathcal{B}$
be the recursive definition that we get after the substitution $B_n = F_n / n^\alpha$. Then
$\mathcal{B}$ is a canonical discrete (continuous) recursive definition.

5  Final  Remarks 

We have shown how to extract useful information from the most common
recursive definitions, exclusively through the analysis of the asymptotic behaviour
of the toll function and distribution of weights. We have only given restricted
versions of the master theorems, which can be further generalized. For instance,
they could be adapted to deal with toll functions that include sublogarithmic
factors (like $\log \log n$). On the other hand, the definition of shape function is also
a bit restrictive, but it is enough for general purposes.
Abstract. The notion of bisimulation as proposed by Larsen and Skou
for  discrete  probabilistic  transition  systems  is  shown  to  coincide  with  a 
coalgebraic  definition  in  the  sense  of  Aczel  and  Mendler  in  terms  of  a  set 
functor.  This  coalgebraic  formulation  makes  it  possible  to  generalize  the 
concepts  to  a  continuous  setting  involving  Borel  probability  measures. 
Under  reasonable  conditions,  generalized  probabilistic  bisimilarity  can  be 
characterized  categorically.  Application  of  the  final  coalgebra  paradigm 
then  yields  an  internally  fully  abstract  semantical  domain  with  respect 
to  probabilistic  bisimulation. 

Keywords. Bisimulation, probabilistic transition system, coalgebra,
ultrametric space, Borel measure, final coalgebra.


1  Introduction 

For discrete probabilistic transition systems the notion of probabilistic
bisimilarity of Larsen and Skou [LS91] is regarded as the basic process equivalence.
The definition was given for reactive systems. However, Van Glabbeek, Smolka
and Steffen showed in joint work with Tofts [GSS95] that for a concrete process
language the usual notion of strong bisimilarity and the probabilistic concepts of
reactive, generative and so-called stratified bisimulation constitute a hierarchy of
observational congruences. Several other probabilistic equivalences have been
proposed in the literature as well. However, in all papers, discrete probability
distributions are used, and hence the transition systems that are treated are in essence
of a finitely branching or image-finite nature. The recent work of Blute et al.
[BDEP97] is the single exception that we know of.

For the exploration of probabilistic transition systems and stochastic
equivalences in the setting of modeling continuous systems, such as real-time or hybrid
systems, one usually wants to allow more general probability measures than the
more limited discrete probability distributions. [BDEP97] use stochastic kernels
and spans of zigzags to underpin their notion of process equivalence. They prove
that their notion of bisimulation agrees in the discrete case with the Larsen-Skou
definition, but do not provide a characterization of bisimilarity in terms of
transition steps, i.e., they do not give a continuous analogue for the Larsen-Skou
bisimulation.

Here we attack the problem of continuous probabilistic transition systems
and bisimulation by exploiting the transition-systems-as-coalgebras paradigm.
Using a minimal amount of category theory, it can be summarized as follows:
Let $\mathcal{F}: \mathcal{C} \to \mathcal{C}$ be any functor on a category $\mathcal{C}$. A coalgebra of $\mathcal{F}$ is an object $S$ in
$\mathcal{C}$ together with an arrow $\alpha: S \to \mathcal{F}(S)$. For many categories and functors, such
a pair $(S, \alpha)$ represents a transition system, the type of which is determined by
the functor $\mathcal{F}$. Vice versa, many types of transition systems can be captured by
a functor this way. For instance, consider the familiar labeled transition systems
$(S, A, \to)$, consisting of a set $S$ of states, a set $A$ of actions, and a transition
relation ${\to} \subseteq S \times A \times S$. Put $\mathcal{L}(X) = \mathcal{P}(A \times X)$, the collection of all subsets of
$A \times X$, for any set $X$, and, for $f: X \to Y$, define $\mathcal{L}(f): \mathcal{L}(X) \to \mathcal{L}(Y)$ by
$\mathcal{L}(f)(\{(a_i, x_i) \mid i \in I\}) = \{(a_i, f(x_i)) \mid i \in I\}$. It can be easily shown that $\mathcal{L}$ is a functor on the
category of sets and functions. A labeled transition system $(S, A, \to)$ can now
be represented as an $\mathcal{L}$-coalgebra by defining

$$\alpha: S \to \mathcal{L}(S), \qquad s \mapsto \{(a, s') \mid (s, a, s') \in {\to}\}.$$

Conversely, any $\mathcal{L}$-coalgebra corresponds to a transition system: If $(S, \alpha)$ is a
coalgebra for $\mathcal{L}$, then $(S, A, \to)$, with ${\to} \subseteq S \times A \times S$ given by $(s, a, s') \in {\to}$ iff
$(a, s') \in \alpha(s)$, is clearly a transition system. (See [Rut96] for more details.)
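The correspondence between labeled transition systems and coalgebras of the powerset functor on action/state pairs can be made concrete for finite systems. The sketch below is only an illustration: the names `to_coalgebra`, `to_transitions` and `lift`, and the two-state example, are assumptions rather than anything defined in the paper.

```python
def to_coalgebra(states, transitions):
    """Build alpha with alpha(s) = {(a, s') | (s, a, s') in transitions}."""
    alpha = {s: set() for s in states}
    for (s, a, s_next) in transitions:
        alpha[s].add((a, s_next))
    return {s: frozenset(successors) for s, successors in alpha.items()}


def to_transitions(alpha):
    """Recover the transition relation: (s, a, s') holds iff (a, s') is in alpha(s)."""
    return {(s, a, s_next)
            for s, successors in alpha.items()
            for (a, s_next) in successors}


def lift(f, pairs):
    """Action of the functor on a function f: apply f to the state component."""
    return frozenset((a, f(x)) for (a, x) in pairs)


states = {"idle", "busy"}
transitions = {("idle", "start", "busy"),
               ("busy", "tick", "busy"),
               ("busy", "stop", "idle")}
alpha = to_coalgebra(states, transitions)
print(sorted(alpha["busy"]))
print(to_transitions(alpha) == transitions)
```

Converting to a coalgebra and back returns the original relation, which is the one-to-one correspondence described in the text, restricted here to finite sets.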

One of the advantages of the coalgebraic view on transition systems is the
existence of a general definition of $\mathcal{F}$-bisimulation, for any functor $\mathcal{F}$ (cf. [AM89]).
For instance, applying that definition to the functor $\mathcal{L}$ above yields the standard
notion of strong bisimulation. In general, the coalgebraic theory gives a generic
approach to the definition and description of bisimulation: First define or
characterize the transition systems one is interested in as coalgebras of a suitably
chosen functor $\mathcal{F}$. Then obtain a definition of bisimulation for those systems by
applying the categorical definition of $\mathcal{F}$-bisimulation.

The coalgebraic approach is applicable to many kinds of transition systems
(see [Rut96] for many examples). In the present paper, this scheme is used to
describe discrete and continuous probabilistic transition systems and bisimulations.
The functor $\mathcal{M}_1$ assigns to a metric space its collection of Borel probability
measures. It is shown that the corresponding notion of $\mathcal{M}_1$-bisimulation coincides,
under mild conditions, with the continuous analogue of Larsen-Skou
bisimulation. This extends a similar result for the discrete case, which is in fact given
first: the functor $\mathcal{D}$, which assigns to a set the collection of its simple
probability distributions, is shown to yield a categorical characterization of Larsen-Skou
bisimulation. Hence, in agreement with general opinion, also from the coalgebraic
point of view the latter equivalence is suggested as the canonical one.

Another appealing aspect of the coalgebraic approach is a canonical way of
finding internally fully abstract domains of bisimulation, where two elements are
equal if and only if they are bisimilar. It follows from a simple but very
general argument that final coalgebras are fully abstract (see Aczel’s final coalgebra
model for nonwellfounded sets [Acz88], and also [RT93]). We shall show that
it follows from general coalgebraic considerations [AR89,Bar93,RT93] that both
our functors $\mathcal{D}$ and $\mathcal{M}_1$ have a final coalgebra, each of which is consequently internally
fully abstract with respect to (discrete and continuous) probabilistic
bisimulation. Therefore these final coalgebras can be exploited as semantic domains for
probabilistic bisimulation (an important direction for future research).

As mentioned above, the functor $\mathcal{M}_1$ is defined on ultrametric spaces, and the
Borel $\sigma$-algebras and associated measures are taken with respect to the metric
topology. Our reasons for considering metric spaces rather than the ordered
structures that are more standard in semantical contexts, as studied, e.g., by Jones and
Plotkin [JP89] and by Edalat [Eda94], are twofold. Firstly, one can resort to
the rich literature on standard measure theory on metric spaces. Secondly, we
can apply the recently developed theory of coalgebraic bisimulation and final
coalgebras in the metric setting [AM89,RT94]. Notably, we shall see that $\mathcal{M}_1$ is
locally contractive, from which it follows that it has a final coalgebra. Because
of the coalgebraic definition of bisimulation, we thus obtain an internally fully
abstract domain. Such a full abstractness result has been lacking so far in the
literature.

In conclusion, $\mathcal{D}$-bisimilarity and Larsen-Skou bisimilarity coincide for
discrete probabilistic transition systems. For the continuous case, the functor $\mathcal{M}_1$
captures the generalization of probabilistic transition systems and, under
conditions, characterizes the associated notion of probabilistic bisimulation. For both
functors a final coalgebra, and hence an internally fully abstract domain, exists,
which can be exploited in the construction of domains for probabilistic
bisimulation semantics.
2 Mathematical Preliminaries

Basic measure theoretic definitions (See, e.g., the standard textbook [Rud66].)
A $\sigma$-algebra on a set $X$ is a collection $\Sigma$ of subsets of $X$ which contains $X$ and
is closed under complement and countable union. Elements $E \in \Sigma$ are called
measurable subsets of $X$. Trivially, the powerset $\mathcal{P}(X)$ is a $\sigma$-algebra for $X$. If $X$
is a topological space, the Borel $\sigma$-algebra $\mathcal{B}(X)$ is defined as the least $\sigma$-algebra
containing all open sets.

A function $\mu: \Sigma \to [0,1]$, where $\Sigma$ is a $\sigma$-algebra on a set $X$, is called a
$\Sigma$-probability measure if $\mu(X) = 1$ and $\mu$ is $\sigma$-additive, i.e.,
$\mu(\bigcup_{i \in I} E_i) = \sum_{i \in I} \mu(E_i)$ for any countable disjoint collection of measurable sets $\{E_i \mid i \in I\}$.
For $X$ a topological space, a Borel probability measure is a probability
measure on $X$ taken with respect to the Borel $\sigma$-algebra $\mathcal{B}(X)$. For $x \in X$, the
Dirac-measure $\delta_x$ is given by $\delta_x(E) = 1$ if $x \in E$, and $\delta_x(E) = 0$ otherwise.
A function $\mu: X \to [0,1]$ is called a simple probability distribution if there
exist $n$ distinct points $x_1, \ldots, x_n \in X$, $n > 0$, such that $\mu(x_1) + \cdots + \mu(x_n) = 1$ and
$\mu(x) = 0$ for $x \notin \{x_1, \ldots, x_n\}$. $\mathcal{D}(X)$ denotes the collection of all simple
probability distributions on $X$. For $E \subseteq X$, $\mu[E]$ is short for $\sum_{x \in E} \mu(x)$. Note that
a simple probability distribution corresponds to a convex linear combination of
Dirac-measures.

Metric spaces (See, e.g., the monograph [BV96].) A pair $(M, d)$ with $M$ a
nonempty set and $d: M \times M \to [0, 1]$ is called an ultrametric space if, for all $x, y, z \in M$:
$d(x,y) = d(y,x)$, $d(x,y) = 0 \iff x = y$, and $d(x,z) \le \max\{d(x,y), d(y,z)\}$.
The last expression is referred to as the strong triangle inequality. For metric
spaces $M_1$, $M_2$, a function $f: M_1 \to M_2$ is called nonexpansive if $d_2(f(x), f(y)) \le d_1(x,y)$,
for all $x, y \in M_1$. In case $d_2(f(x), f(y)) \le \kappa \cdot d_1(x,y)$, for all $x, y \in M_1$, the
function $f$ is called $\kappa$-contractive, where $\kappa$ is a constant with $0 \le \kappa < 1$. The
collection of all nonexpansive mappings from $M_1$ to $M_2$ is denoted by $M_1 \to_1 M_2$.
We use the notation $\mathcal{O}$, or more explicitly $\mathcal{O}(M)$, for the collection of all open
subsets of $M$. For $\varepsilon > 0$ we put $\mathcal{O}_\varepsilon = \{\, O \in \mathcal{O} \mid \forall x \in O: B_\varepsilon(x) \subseteq O \,\}$.
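The strong triangle inequality is easy to test on a finite sample. The sketch below uses a common textbook ultrametric on words, $d(x, y) = 2^{-n}$ with $n$ the length of the longest common prefix. This distance is an illustrative assumption, not a construction taken from the paper, and the helper names are hypothetical.

```python
from itertools import product


def prefix_distance(x, y):
    """d(x, y) = 0 if x == y, else 2^(-length of the longest common prefix)."""
    if x == y:
        return 0.0
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return 2.0 ** (-n)


def is_ultrametric(points, d):
    """Check symmetry, identity of indiscernibles and the strong triangle inequality."""
    for x, y in product(points, repeat=2):
        if d(x, y) != d(y, x) or (d(x, y) == 0) != (x == y):
            return False
    return all(d(x, z) <= max(d(x, y), d(y, z))
               for x, y, z in product(points, repeat=3))


words = ["", "a", "ab", "abc", "b", "ba"]
print(is_ultrametric(words, prefix_distance))
# The scaled absolute difference is a metric but not an ultrametric.
print(is_ultrametric([0, 1, 2], lambda x, y: abs(x - y) / 10))
```

The second call fails because $d(0, 2) = 0.2$ exceeds $\max\{d(0,1), d(1,2)\} = 0.1$, which is allowed by the ordinary triangle inequality but not by the strong one.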

Binary relations For a binary relation $R \subseteq S \times T$ we use $\pi_1$ and $\pi_2$ for the
projections of $R$ on $S$ and $T$, respectively. $R$ is called total if the two projections
$\pi_1$ and $\pi_2$ are surjective. We say that $R$ is z-closed if, for all $s, s' \in S$, $t, t' \in T$,
$R(s,t) \wedge R(s',t) \wedge R(s',t') \Rightarrow R(s,t')$. If we put, for $n \in \mathbb{N}$, $R_0 = R$, $R_{n+1} =
\{(s, t') \in S \times T \mid \exists s' \in S, t \in T: R(s,t) \wedge R_n(s',t) \wedge R(s',t')\}$, and $R^* = \bigcup_{n \in \mathbb{N}} R_n$,
we have that $R^*$ is the least z-closed binary relation on $S \times T$ containing $R$. Below
we will employ, for $s \in S$, the notation $F(s) = \{t \in T \mid R(s,t)\}$ and, for $U \subseteq S$,
$F[U] = \bigcup_{s \in U} F(s)$; likewise, for $t \in T$, $E(t) = \{s \in S \mid R(s,t)\}$, and,
for $V \subseteq T$, $E[V] = \bigcup_{t \in V} E(t)$.
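For finite relations the z-closure can be computed by saturation. The sketch below repeatedly applies the z-closedness rule until nothing changes, instead of building the chain $R_0, R_1, \ldots$ literally; both give the least z-closed relation containing $R$. The function names and the example relation are illustrative assumptions.

```python
def is_z_closed(relation):
    """Check R(s,t) and R(s',t) and R(s',t') implies R(s,t')."""
    return all((s, t2) in relation
               for (s, t) in relation
               for (s2, t2) in relation
               if (s2, t) in relation)


def z_closure(relation):
    """Least z-closed relation containing the given finite relation."""
    closure = set(relation)
    changed = True
    while changed:
        changed = False
        for (s, t) in list(closure):
            for (s2, t2) in list(closure):
                # (s, t), (s2, t) and (s2, t2) together force (s, t2).
                if (s2, t) in closure and (s, t2) not in closure:
                    closure.add((s, t2))
                    changed = True
    return closure


R = {(1, "a"), (2, "a"), (2, "b")}
print(is_z_closed(R))
print(sorted(z_closure(R)))
```

In the example, states 1 and 2 share the partner `"a"`, so the closure must also relate 1 to `"b"`.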

Coalgebras (See, e.g., [Rut96].) Let $\mathcal{C}$ be either the category of sets and functions,
or the category of ultrametric spaces and nonexpansive mappings. (These are
the only categories playing a role in this paper.) Let $\mathcal{F}: \mathcal{C} \to \mathcal{C}$ be a functor.
An $\mathcal{F}$-coalgebra is a pair $(S, \alpha)$ consisting of an object $S$ in $\mathcal{C}$ together with an
arrow $\alpha: S \to \mathcal{F}(S)$ in $\mathcal{C}$ called a coalgebra structure on $S$. A homomorphism
between two $\mathcal{F}$-coalgebras $(S, \alpha)$ and $(T, \beta)$ is an arrow $f: S \to T$ in $\mathcal{C}$ such that
$\mathcal{F}(f) \circ \alpha = \beta \circ f$.

An $\mathcal{F}$-bisimulation between two $\mathcal{F}$-coalgebras $(S, \alpha)$ and $(T, \beta)$ is a relation
$R \subseteq S \times T$ for which there exists a coalgebra structure $\gamma: R \to \mathcal{F}(R)$ such that
the projections $\pi_1: R \to S$ and $\pi_2: R \to T$ are homomorphisms: $\mathcal{F}(\pi_1) \circ \gamma = \alpha \circ \pi_1$
and $\mathcal{F}(\pi_2) \circ \gamma = \beta \circ \pi_2$. We then say that $R$ is an $\mathcal{F}$-bisimulation for $\alpha$ and $\beta$.
The arrow $\gamma$ is called mediating for $\alpha$ and $\beta$. We write $x \sim y$ (‘$x$ and $y$ are
bisimilar’) whenever there exists an $\mathcal{F}$-bisimulation $R$ with $(x, y) \in R$.

An $\mathcal{F}$-coalgebra $(D, \delta)$ is called final if there exists for any $\mathcal{F}$-coalgebra $(S, \alpha)$ a
unique homomorphism from $(S, \alpha)$ to $(D, \delta)$. We have the following result.

Theorem 1. (Internal full abstractness) For a final $\mathcal{F}$-coalgebra $(D, \delta)$ and
$x, y \in D$, $x = y$ if and only if $x \sim y$.
The proof is easy, see, e.g., [Rut96], Theorem 9.2. The main difficulty in obtaining
full  abstractness  lies  in  the  construction  of  a  final  coalgebra,  which  in  general  is 
nontrivial. 


3 A coalgebraic interpretation of Larsen-Skou bisimulation

Starting from the definitions of a discrete probabilistic transition system and
probabilistic bisimulation as proposed in the literature, we will consider
generalizations of (discrete) probabilistic transition systems as coalgebras of a functor $\mathcal{D}$
on Set. We argue that $\mathcal{D}$-bisimilarity implies probabilistic bisimilarity and,
using the notion of z-closure, that probabilistic bisimulation and totality imply
$\mathcal{D}$-bisimilarity. Then it is shown how this leads to the existence of a fully
abstract domain.

Definition 2. [LS91,GSS95] A discrete probabilistic transition system is a tuple
$(Pr, Act, \mu)$ where $Pr$ is a given set of processes, $Act$ is a given set of actions, and
$\mu: Pr \times Act \times Pr \to [0, 1]$ is a so-called transition probability function, i.e., for
all $P \in Pr$, $a \in Act$, $\mu(P, a, \cdot)$ is either the zero-map or a simple probability
distribution.

A probabilistic bisimulation for a discrete probabilistic transition system is
an equivalence $\equiv$ on $Pr$ such that

$$P \equiv Q \;\Rightarrow\; \sum_{P' \in E} \mu(P, a, P') = \sum_{P' \in E} \mu(Q, a, P')$$

for all $P, Q \in Pr$, $a \in Act$, and equivalence classes $E \in Pr/{\equiv}$. (Using the
conventions of Section 2, the implication can also be written as $P \equiv Q \Rightarrow
\mu[P, a, E] = \mu[Q, a, E]$.) Two processes $P$ and $Q$ are said to be probabilistic
bisimilar if some probabilistic bisimulation contains the pair $(P, Q)$.
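For a finite system, the condition of Definition 2 can be checked directly once the equivalence is given as a partition. The sketch below is an illustration under stated assumptions: the transition probability function is stored as a dictionary from `(process, action)` to a dictionary of successor probabilities (a missing key stands for the zero-map), and the names `class_mass` and `is_probabilistic_bisimulation` as well as the five-process example are hypothetical.

```python
from fractions import Fraction


def class_mass(mu, p, a, block):
    """mu[P, a, E]: total probability of moving from p by a into the block."""
    successors = mu.get((p, a), {})
    return sum((successors.get(q, Fraction(0)) for q in block), Fraction(0))


def is_probabilistic_bisimulation(partition, actions, mu):
    """Check that equivalent processes assign equal mass to every class."""
    for block in partition:
        for p in block:
            for q in block:
                for a in actions:
                    for target in partition:
                        if class_mass(mu, p, a, target) != class_mass(mu, q, a, target):
                            return False
    return True


half = Fraction(1, 2)
mu = {
    ("p", "a"): {"x": half, "y": half},
    ("q", "a"): {"x": half, "z": half},
    ("x", "b"): {"x": Fraction(1)},
}
actions = {"a", "b"}
good = [{"p", "q"}, {"x"}, {"y", "z"}]
bad = [{"p", "q", "x"}, {"y", "z"}]
print(is_probabilistic_bisimulation(good, actions, mu))
print(is_probabilistic_bisimulation(bad, actions, mu))
```

In the first partition `p` and `q` reach the classes `{x}` and `{y, z}` with probability 1/2 each, so they are related even though their successors differ. The second partition fails because `x` has no `a`-transition while `p` does.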

Above we introduced the notation $\mathcal{D}(S)$ for the collection of all simple probability
distributions over a set $S$. In fact, $\mathcal{D}$ can be extended to a Set-functor by defining
for a mapping $f: S \to T$ a function $\mathcal{D}(f): \mathcal{D}(S) \to \mathcal{D}(T)$ which maps a simple
distribution $\mu$ on $S$ to a simple distribution $\mathcal{D}(f)(\mu)$ on $T$ such that $\mathcal{D}(f)(\mu)(t) = \mu[f^{-1}(t)]$.
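The action of the distribution functor on a function is the usual image (pushforward) of a distribution: the mass of a point in the target is the total mass of its preimage. The sketch below assumes simple distributions are represented as dictionaries with exact rational weights; the name `pushforward` and the example are illustrative.

```python
from fractions import Fraction


def pushforward(f, mu):
    """Image distribution: the weight of t is the total weight of all s with f(s) == t."""
    image = {}
    for s, weight in mu.items():
        t = f(s)
        image[t] = image.get(t, Fraction(0)) + weight
    return image


mu = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
parity = lambda n: n % 2
print(pushforward(parity, mu))
```

Functoriality can be observed on the example: pushing forward along a composite gives the same result as pushing forward in two steps.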

Let 0 represent termination. Note that a probabilistic transition system is
just a mapping $\mu: Pr \times Act \to \mathcal{D}(Pr) + \{0\}$ or, equivalently, a function $\mu: Pr \to
(Act \to (\mathcal{D}(Pr) + \{0\}))$. In other words, a probabilistic transition system is
precisely a coalgebra of the functor $Act \to (\mathcal{D}(\cdot) + \{0\})$. Applying the category
theoretical machinery as described in Section 2 now gives us the coalgebraic
notion of bisimulation. We will show that it corresponds to (actually generalizes)
the notion of probabilistic bisimulation of Definition 2, thus providing categorical
evidence for the Larsen-Skou bisimulation as the canonical process equivalence
for discrete probabilistic transition systems.

For clarity of presentation we suppress, for the moment, the action
component of a probabilistic transition system, and also do not bother about
termination. Thus we consider coalgebras of the functor $\mathcal{D}$ itself. As it turns out, the
presence of labels and termination does not make any essential difference for
the technical content of what follows. Before we relate probabilistic bisimulation
with $\mathcal{D}$-bisimulation, we first give a generalization of Definition 2, by
allowing bisimulations between different transition systems, which are not necessarily
equivalence relations.

Definition 3. Let $\alpha: S \to \mathcal{D}(S)$, $\beta: T \to \mathcal{D}(T)$ be two (stripped) discrete
probabilistic transition systems. A binary relation $R \subseteq S \times T$ is called a probabilistic
bisimulation for $\alpha, \beta$ iff $R(s,t) \Rightarrow \alpha(s)[U] = \beta(t)[V]$, for all $s \in S$, $t \in T$ and
$U \subseteq S$, $V \subseteq T$ such that $\pi_1^{-1}(U) = \pi_2^{-1}(V)$. Two elements $s \in S$, $t \in T$ are
said to be probabilistic bisimilar if some probabilistic bisimulation contains the
pair $(s, t)$.

Note that if $R$ is an equivalence relation, then $\pi_1^{-1}(U) = \pi_2^{-1}(V)$ if and only if
$U = \bigcup_{i \in I} E_i = V$, for some collection of equivalence classes $\{E_i \mid i \in I\}$ of $R$. Thus
in this case, the condition on $U$ and $V$ in Definition 3 amounts to the assumption
of $E$ being an equivalence class in Definition 2, or, following the terminology
of [Hen95], $U$ and $V$ are the same ‘$\equiv$’-block. This shows that Definition 2 is a
special instance of Definition 3 (‘modulo’ the presence of labels and termination).
By exploitation of the various definitions one straightforwardly verifies that
$\mathcal{D}$-bisimulation implies probabilistic bisimulation.

Lemma 4. Let $\alpha: S \to \mathcal{D}(S)$ and $\beta: T \to \mathcal{D}(T)$ be two discrete probabilistic
transition systems. Let $R$ be a $\mathcal{D}$-bisimulation for $\alpha, \beta$. Then $R$ is a probabilistic
bisimulation for $\alpha, \beta$.

The  reverse  of  the  above  lemma  is  more  intricate.  We  will  first  use  the  concept 
of  z-closure  and  associated  properties  as  developed  in  Section  2. 

Lemma 5. If $R \subseteq S \times T$ is a probabilistic bisimulation for $\alpha: S \to \mathcal{D}(S)$, $\beta: T \to
\mathcal{D}(T)$, then so is $R^*$, the z-closure of $R$.

So, if $s \in S$ and $t \in T$ are probabilistic bisimilar, we can assume, without loss
of generality, that there exists a z-closed probabilistic bisimulation
containing $(s, t)$. We will need, for technical reasons, that $R$ is total. This is equivalent
to the common assumption that transition systems have a distinguished initial
state and that only reachable states are considered.

Theorem 6. Let $R \subseteq S \times T$ be a probabilistic bisimulation for $\alpha: S \to \mathcal{D}(S)$
and $\beta: T \to \mathcal{D}(T)$. Moreover, assume $R$ to be z-closed and total. Then $R$ is a
$\mathcal{D}$-bisimulation.


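Proof. The mapping $\gamma: R \to \mathcal{D}(R)$ given by

$$\gamma(s,t)(s',t') = \begin{cases} 0 & \text{if } \beta(t)[F(s')] = 0, \\[4pt] \dfrac{\alpha(s)(s') \cdot \beta(t)(t')}{\beta(t)[F(s')]} & \text{otherwise,} \end{cases}$$

for $(s,t) \in R$, is mediating for $\alpha$ and $\beta$. $\Box$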
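The mediating map can be tried out on a small finite example. The sketch below assumes one reading of the display in the proof, namely $\gamma(s,t)(s',t') = \alpha(s)(s') \cdot \beta(t)(t') / \beta(t)[F(s')]$ for $(s',t') \in R$, and 0 when the denominator vanishes. The function names and the three-state example are illustrative assumptions. Being mediating means that the two marginals of $\gamma(s,t)$ are $\alpha(s)$ and $\beta(t)$, which is what the code checks.

```python
from fractions import Fraction


def mass(dist, subset):
    """dist[U]: total weight a simple distribution gives to a subset."""
    return sum((dist.get(x, Fraction(0)) for x in subset), Fraction(0))


def mediating(alpha, beta, relation):
    """Candidate coalgebra structure gamma on R under the assumed case formula."""
    partners = lambda s: {t for (s1, t) in relation if s1 == s}  # F(s)
    gamma = {}
    for (s, t) in relation:
        dist = {}
        for (s2, t2) in relation:
            denom = mass(beta[t], partners(s2))
            if denom != 0:
                w = (alpha[s].get(s2, Fraction(0))
                     * beta[t].get(t2, Fraction(0)) / denom)
                if w != 0:
                    dist[(s2, t2)] = w
        gamma[(s, t)] = dist
    return gamma


def marginal(dist, index):
    """Image of a distribution on pairs under the projection onto one component."""
    out = {}
    for pair, w in dist.items():
        out[pair[index]] = out.get(pair[index], Fraction(0)) + w
    return out


half = Fraction(1, 2)
alpha = {"s0": {"s1": half, "s2": half}, "s1": {"s1": Fraction(1)}, "s2": {"s2": Fraction(1)}}
beta = {"t0": {"t1": Fraction(1)}, "t1": {"t1": Fraction(1)}}
R = {("s0", "t0"), ("s1", "t1"), ("s2", "t1")}  # z-closed and total
gamma = mediating(alpha, beta, R)
print(gamma[("s0", "t0")])
print(all(marginal(gamma[(s, t)], 0) == alpha[s] and
          marginal(gamma[(s, t)], 1) == beta[t] for (s, t) in R))
```

Here `s1` and `s2` are both related to `t1`, so the half/half split of `s0` is matched by the single step of `t0`, and $\gamma$ distributes the mass over the related pairs accordingly.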
The format of the definition of $\gamma(s,t)$ is reminiscent of the discrete probability
distributions of [JL91]. It is however not clear how their notion of probabilistic
specification extends to the continuous setting of Section 4.

It is straightforward to adapt the above line of reasoning to a functor $\mathcal{D}'$
given by $\mathcal{D}' = Act \to (\mathcal{D}(\cdot) + \{0\})$. The discrete probabilistic transition systems
of Definition 2 are in 1-1 correspondence with the coalgebras of this functor, and
the notion of $\mathcal{D}'$-bisimulation coincides with that of probabilistic bisimulation of
Definition 2 (for total relations $R$).

We  can  now  benefit  from  some  general  insights  in  the  theory  of  coalgebras, 
by  applying  (a  minor  variation  on)  a  result  from  [Bar93]  involving  boundedness 
of  a  set  functor. 

Theorem 7. The functor $\mathcal{D}$ (and also $\mathcal{D}'$) has a final coalgebra.

The final coalgebra for $\mathcal{D}'$ is nontrivial. The final coalgebra for $\mathcal{D}$, though, is
degenerate: it equals the one element set. This is equivalent to the fact that,
due to the absence of labels and a concept of termination as present for $\mathcal{D}'$, all
elements in any two $\mathcal{D}$-coalgebras are probabilistically bisimilar.

Let $P$ be the final $\mathcal{D}'$-coalgebra, so $P = Act \to (\mathcal{D}(P) + \{0\})$. (Note that
final coalgebras are always fixed points. See, e.g., [Rut96], Theorem 9.1.) The
following is immediate by Theorem 1.

Corollary 8. The system $P$ is internally fully abstract with respect to the
original notion of probabilistic bisimulation of Definition 2.

4 $\mathcal{M}_1$-Bisimilarity for Probabilistic Transition Systems

The previous section illustrates that in a discrete probabilistic setting, a
coalgebraic interpretation of probabilistic transition systems and bisimulation can be
given, which is equivalent to the usual ‘direct’ approach. One of the advantages
of the abstract coalgebraic approach is that it can fairly easily be generalized to
the continuous setting of stochastic systems. We will now, in fact, allow
probability measures to play the role of the simple distributions in the definition of a
probabilistic transition system.

Probability measures only make sense in the context of a $\sigma$-algebra. When the
collection of processes comes equipped with a topology (as is the case if the set
of processes is endowed with an order or a metric structure), the obvious choice
for this $\sigma$-algebra is the Borel $\sigma$-algebra, i.e. the least $\sigma$-algebra containing all
the open sets. As mentioned in the introduction, we prefer the use of ultrametrics
(cf. [BV96]) over orders, because of a combination of the following two reasons:
(1) the technical advantage of a close relationship between standard measure
theory and metric topology, and (2) the availability of a final coalgebra theorem
in the metric setting, leading to a fully abstract domain for general probabilistic
bisimulation.

The  generalization  of  the  notion  of  a  discrete  probabilistic  transition  system 
and  the  associated  concept  of  bisimulation  as  proposed  by  Larsen  and  Skou  is 
as  follows. 
Definition 9. A (general) probabilistic transition system is a tuple $(Pr, Act, \mu)$
where $Pr$ is a given ultrametric space of processes, $Act$ is a given set of actions,
and $\mu: Pr \times Act \times \mathcal{B}(Pr) \to [0, 1]$ is a so-called (general) transition probability
function, i.e., $\mu(P, a, \cdot)$ is either the zero-map, or a Borel probability measure,
for all $P \in Pr$, $a \in Act$. (Here $\mathcal{B}(Pr)$ denotes the collection of Borel measurable
subsets of $Pr$.)

A probabilistic bisimulation for a probabilistic transition system $(Pr, Act, \mu)$
is an equivalence $\equiv$ on $Pr$ such that every equivalence class $E \subseteq Pr$ of ‘$\equiv$’ is
measurable, and

$$P \equiv Q \;\Rightarrow\; \mu(P, a, E) = \mu(Q, a, E)$$

for all $P, Q \in Pr$, $a \in Act$, and $E \in Pr/{\equiv}$. Two processes $P$ and $Q$ in $Pr$ are said
to be probabilistic bisimilar if there exists a probabilistic bisimulation containing
the pair $(P, Q)$.

Note that the equivalence classes $E$ of $\equiv$ must be measurable, since only then
are the values $\mu(P, a, E)$, $\mu(Q, a, E)$ well-defined.

For reasons of presentation, we dispense with the actions and with the
treatment of termination. They can be added again later. In this way, a probabilistic
transition system becomes a function $\alpha: S \to \mathcal{M}_1(S)$ where $\mathcal{M}_1(S)$ denotes the
collection of all Borel probability measures. In the reformulation of the related
notion of probabilistic bisimulation we give, as before, first a slightly more
general definition of bisimilarity of systems with different carriers.

Definition 10. Let $\alpha: S \to \mathcal{M}_1(S)$ and $\beta: T \to \mathcal{M}_1(T)$ be two probabilistic
transition systems. A relation $R \subseteq S \times T$ is called a probabilistic bisimulation
for $\alpha, \beta$ iff $R(s,t) \Rightarrow \alpha(s)(U) = \beta(t)(V)$ for all $s \in S$, $t \in T$ and $U \in \mathcal{B}(S)$,
$V \in \mathcal{B}(T)$ such that $\pi_1^{-1}(U) = \pi_2^{-1}(V)$. Two elements $s \in S$, $t \in T$ are said to be
probabilistic bisimilar iff some probabilistic bisimulation contains the pair $(s, t)$.

As for V in the previous section, $\mathcal{M}_1$ can be regarded as a functor, viz. a functor
on the category UMS of ultrametric spaces and nonexpansive mappings.

Definition 11. The functor $\mathcal{M}_1: UMS \to UMS$ is given as follows: $\mathcal{M}_1(M)$ is
the collection of all Borel probability measures endowed with the metric $d$ such
thdit  d(ii,iy)  <  e  4=^  VO  G  /i(0)  =  i/(0),  for  all /i,  i/ G  Adi  (M),  e  >  0.  For 
nonexpansive $f: M \to N$ the mapping $\mathcal{M}_1(f): \mathcal{M}_1(M) \to \mathcal{M}_1(N)$ is defined by

$$\mathcal{M}_1(f)(\mu)(U) = \mu(f^{-1}(U))$$

for all $U \in \mathcal{B}(N)$.
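
To make the action of the functor on maps concrete, here is a small sketch for finitely supported measures. It assumes the usual image-measure construction, $\mathcal{M}_1(f)(\mu)(U) = \mu(f^{-1}(U))$; the function names `pushforward` and `measure_of` and the sample data are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction as F

def pushforward(f, mu):
    """Image measure of a finitely supported measure mu under f.

    mu is a dict {point: mass}; the result nu satisfies
    nu(U) = mu(f^{-1}(U)) for every set U of target points.
    """
    nu = defaultdict(F)                # Fraction() is zero
    for x, mass in mu.items():
        nu[f(x)] += mass
    return dict(nu)

def measure_of(mu, U):
    """Mass that mu assigns to the set U."""
    return sum((m for x, m in mu.items() if x in U), F(0))

mu = {0: F(1, 4), 1: F(1, 4), 2: F(1, 2)}
nu = pushforward(lambda x: x % 2, mu)  # collapse points by parity
```

The preimage of `{0}` under the parity map is `{0, 2}`, so both sides of the defining equation give 3/4 in this example.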

Elementary considerations concerning Borel $\sigma$-algebras and nonexpansive maps
show that $\mathcal{M}_1$ is a well-defined functor on UMS. Following the coalgebraic
paradigm, $\mathcal{M}_1$ induces a notion of $\mathcal{M}_1$-bisimulation. One half of the relationship
of $\mathcal{M}_1$-bisimulation and probabilistic bisimulation can be shown directly.

Lemma 12. Let $\alpha: S \to \mathcal{M}_1(S)$, $\beta: T \to \mathcal{M}_1(T)$ be two probabilistic transition
systems. Any $\mathcal{M}_1$-bisimulation $R$ for $\alpha$ and $\beta$ is also a probabilistic bisimulation
for $\alpha, \beta$.

Below we show that the reverse also holds under reasonable conditions. The technicality
to be dealt with concerns the proper generalization of the measurability
condition of the equivalence classes $E$.

For a probabilistic bisimulation $\equiv$ in the sense of Definition 9 we have, by an
elementary set-theoretic argument, a partitioning into squares of subsets. Moreover,
these subsets are measurable by assumption. So, we have $\equiv \;=\; \bigcup_{i \in I} E_i \times E_i$.
Similarly, for the general set-up, we want a decomposition $R = \bigcup_{k \in K} E_k \times F_k$
where the $E_k$ and $F_k$ are Borel sets in $S$ and $T$, respectively. Additionally,
for measure theoretical considerations, we will assume the number of rectangles
$E_k \times F_k$ that constitute $R$ to be countable.

Definition 13. A binary relation $R \subseteq S \times T$ on two ultrametric spaces $S$ and $T$
is said to have a Borel decomposition iff $R = \bigcup_{k \in K} E_k \times F_k$ where $\{E_k \mid k \in K\}$,
$\{F_k \mid k \in K\}$ are countable partitions of Borel sets of $S$ and $T$, respectively.

In the construction of a mediating probabilistic transition system $\gamma: R \to \mathcal{M}_1(R)$,
for a given probabilistic bisimulation $R$, we can again assume that $R$ is z-closed.
Since no measure theoretical considerations are involved, the proof of this is
literally as for Lemma 5. The property is used in the next result.


Theorem 14. Let $\alpha: S \to \mathcal{M}_1(S)$, $\beta: T \to \mathcal{M}_1(T)$ be two probabilistic transition
systems. Let $R$ be a probabilistic bisimulation for $\alpha, \beta$ in the sense of
Definition 10. Assume that $R$ is z-closed. If $R$ has a Borel decomposition, then
$R$ is an $\mathcal{M}_1$-bisimulation for $\alpha, \beta$.

Proof. Let $\{E_k \times F_k \mid k \in K\}$ be a Borel decomposition of $R$. Suppose $R(s,t)$
holds. The mapping $\gamma(s,t): \mathcal{B}(R) \to [0,1]$ is then given by

$$\gamma(s,t)\big((U \times V) \cap R\big) = \sum_{k \in K} \frac{\alpha(s)(U \cap E_k) \cdot \beta(t)(V \cap F_k)}{\beta(t)(F_k)} \qquad (4.1)$$

for $U \in \mathcal{B}(S)$, $V \in \mathcal{B}(T)$. The verification that $\gamma(s,t)$ is well-defined and mediating
for $\alpha(s)$, $\beta(t)$ is nontrivial but omitted for reasons of space. □
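
In the finite case the "mediating" property can be checked directly: the measure built from (4.1) should have $\alpha(s)$ and $\beta(t)$ as its two marginals. The sketch below assumes finitely supported measures, the bisimulation condition $\alpha(s)(E_k) = \beta(t)(F_k)$ for each rectangle, and that rectangles with $\beta(t)(F_k) = 0$ contribute nothing; all names (`mediating_measure`, `marginal`, the sample data) are hypothetical.

```python
from fractions import Fraction as F

def mediating_measure(alpha_s, beta_t, rectangles):
    """Point masses of gamma(s, t) on R, following formula (4.1).

    alpha_s, beta_t: dicts {point: mass}.
    rectangles: list of pairs (E_k, F_k) whose products make up R.
    """
    gamma = {}
    for E, Fk in rectangles:
        weight = sum((beta_t.get(y, F(0)) for y in Fk), F(0))
        if weight == 0:
            continue                   # assumption: such terms vanish
        for x in E:
            for y in Fk:
                gamma[(x, y)] = alpha_s.get(x, F(0)) * beta_t.get(y, F(0)) / weight
    return gamma

def marginal(gamma, index):
    """Project gamma onto its first (index 0) or second (index 1) coordinate."""
    out = {}
    for pair, mass in gamma.items():
        out[pair[index]] = out.get(pair[index], F(0)) + mass
    return out

alpha_s = {"s1": F(1, 4), "s2": F(1, 4), "s3": F(1, 2)}
beta_t = {"t1": F(1, 2), "t2": F(1, 2)}
rectangles = [({"s1", "s2"}, {"t1"}), ({"s3"}, {"t2"})]
gamma = mediating_measure(alpha_s, beta_t, rectangles)
```

Here $\alpha(s)$ gives mass 1/2 to each $E_k$ and $\beta(t)$ gives mass 1/2 to each $F_k$, so both projections of `gamma` recover the original measures.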


In  the  remainder  of  this  section,  we  shall  again  use  some  general  insights  from 
the  theory  of  coalgebras,  this  time  by  applying  a  result  from  [AR89,RT93]. 

It turns out that we are only able to show the existence of a final coalgebra
when we consider an adaptation of $\mathcal{M}_1$, say $\mathcal{M}_1'$, which delivers Borel probability
measures with so-called compact support, i.e., measures that vanish outside a
compact set. More precisely, for a metric space $M$, $\mu: \mathcal{B}(M) \to [0,1]$ is said to
have a compact support if, for some compact subset $K \subseteq M$, we have that
$U \cap K = \emptyset \Rightarrow \mu(U) = 0$, for all $U \in \mathcal{B}(M)$. Let $\mathcal{M}_1'(M)$ denote the collection of
all Borel probability measures with compact support on an ultrametric space $M$.
Similarly as for $\mathcal{M}_1$, the new $\mathcal{M}_1'$ extends to a functor on UMS.

Additionally, to ensure the property of local contractivity (see, e.g., [RT93]),
we put in a scaling functor $\cdot/2$. This operation is harmless from a semantical point
of view. The usage of $\mathcal{M}_1'$, though, does narrow the type of transition systems
falling within the framework. However, we stress that the established relationship
of coalgebraic and probabilistic bisimulation still carries through for the modified
setting. Additionally, for the class of transition systems now captured by the
functor $Act \to (\mathcal{M}_1'(\cdot)/2 + \{0\})$, the existence of a final coalgebra is guaranteed.

Theorem 15. Let the functor $\mathcal{F}: UMS \to UMS$ be given by $\mathcal{F} = Act \to (\mathcal{M}_1'(\cdot)/2 + \{0\})$.
Then the following holds:

(a) $\mathcal{F}$ is locally contractive, i.e., for some $\kappa$, $0 < \kappa < 1$, and all ultrametric
spaces $M$ and $N$, the function $\mathcal{F}_{M,N}: (M \to_1 N) \to (\mathcal{F}(M) \to_1 \mathcal{F}(N))$ given
by $\mathcal{F}_{M,N}(f) = \mathcal{F}(f)$ is $\kappa$-contractive.

(b) If $M$ is complete, then $\mathcal{F}(M)$ is complete.

(c) The functor $\mathcal{F}$ has a final coalgebra.

The presence of '$\cdot/2$' in the definition of $\mathcal{F}$ results in (a). (The other constituent
functors are locally nonexpansive.) Only for part (b) is the assumption of measures
having a compact support necessary. Its proof is non-trivial. Finally, part (c)
follows from (a), (b), and (a minor variation of) [RT93], Theorem 4.8.

Let $Q$ be the final $\mathcal{F}$-coalgebra: $Q \cong Act \to (\mathcal{M}_1'(Q)/2 + \{0\})$. From Theorems 1
and 15 we then immediately obtain the following result.

Corollary 16. The system $Q$ is internally fully abstract with respect to
probabilistic bisimulation.


5  Conclusion  and  future  research 

In this paper, a framework is proposed for probabilistic transition systems,
involving general probability measures, and an associated notion of probabilistic
bisimulation. Most research reported in the literature so far deals with discrete
probabilistic transition systems, employing simple probability distributions only.
The use of Borel measures allows for an extension of this to a continuous setting,
which is necessary for the further development of models for dynamical,
real-time, and in particular hybrid systems, for which discreteness and
image-finiteness are often too restrictive.

Following the transition-systems-as-coalgebras paradigm, the categorical set-up
provides a characterization of the Larsen-Skou bisimulation in terms of a
set functor. For the continuous case, a similar result is shown for a functor on
the category of ultrametric spaces. Moreover, exploiting parts of the theory of
coalgebras, both for the discrete case and for the continuous case, internally fully
abstract domains are constructed.

Further investigations of the proposed notion of Borel decomposition should
clarify how the latter relates to the use of Polish spaces as in [BDEP97]. We expect
that the technical result obtained there, on the existence of weak pullbacks,
applies also to our setting. Also, once a suitable continuous process language is
identified (such as PCCS [GJS90] for the discrete case), the process equivalences
and fully abstract domains presented in this paper may be fruitfully applied in
the semantical study of dynamical and hybrid systems.

Distributed Processes and Location Failures

(Extended  Abstract) 

Abstract

Site failure is an essential aspect of distributed systems; nonetheless its effect
on programming language semantics remains poorly understood. To model such
systems, we define a process calculus in which processes are run at distributed
locations. The language provides operators to kill locations, to test the status (dead
or alive) of locations, and to spawn processes at remote locations. Using a variation
of bisimulation, we provide alternative characterizations of strong and weak barbed
congruence for this language, based on an operational semantics that uses
configurations to record the status of locations. We then derive a second, symbolic
characterization in which configurations are replaced by logical formulae. In the strong
case the formulae come from a standard propositional logic, while in the weak case
a temporal logic with past time modalities is required. The symbolic characterization
establishes that, in principle, barbed congruence for such languages can be
checked efficiently using existing techniques.


1  Introduction 

Many semantic theories have been proposed for concurrent processes [18, 16, 6].
Although these theories have been fruitfully applied to the analysis of some distributed
systems, for the most part they ignore an essential feature of such systems, namely their
distribution.

As  a  simple  example  consider  two  implementations  of  a  client-server  application 
in  which  the  client  can  demand  an  interactive  service  provided  by  the  server,  such  as 
previewing  or  updating  a  document.  In  one  implementation  (System  A)  the  server 
spawns  a  process  to  handle  the  document  at  its  own  site,  the  remote  location,  and  the 
client  previews  the  document  remotely.  In  the  other  (System  B)  the  server  sends  a 
process,  including  the  document,  to  the  client  site,  and  the  client  previews  the  document 
locally.  Using  the  semantic  theories  mentioned  above  it  would  be  difficult  to  distinguish 
between  these  implementations,  as  the  only  difference  between  them  is  the  location  at 
which  activity  occurs.  We  aim  to  develop  a  useful  extensional  theory  of  systems  which 
would  take  this  type  of  property  into  account. 

* Research funded by EPSRC project GR/K60701. Authors' address: School of Cognitive and Computing
Sciences, Univ. of Sussex, Falmer, Brighton, BN1 9QH, UK, {jamesri,matthewh}@cogs.susx.ac.uk
gestions in the early stages of this work.

In [8, 20, 10] such theories have been proposed. All of these theories, however,
are based on a very strong assumption: that an observer, or user, can determine the
location at which every action is performed. Here we start from a weaker premise: that
in distributed systems sites are liable to failure. The model of failure we have adopted
is a fail-stop model in which failures are independent of each other and the number of
failures that can occur is unbounded. Assuming that sites can fail, it is easy to see that
Systems A and B, outlined above, are indeed different: if, after the client has begun
interaction with the document, a failure occurs at the remote site, then in System A the
client deadlocks, while in System B it can continue operation unaffected.

Our  work  is  motivated  by  the  papers  [2,  12].  In  these  papers,  distributed  languages 
with  location  failures  are  defined  and  shown  to  be  very  expressive.  In  both  of  these 
papers,  the  semantics  is  based  on  barbed  equivalence,  which  requires  quantification 
over  all  program  contexts  and  thus  is  difficult  to  use  directly.  In  each  of  the  cited  works, 
the  authors  provide  a  translation  from  their  language  into  a  simpler  (non-distributed) 
language  and  prove  that  the  translations  are  adequate  or  fully  abstract  in  some  sense. 
While  these  translations  provide  theoretical  results  about  the  relative  expressiveness  of 
distributed  and  interleaving  calculi,  they  are  sufficiently  complicated  to  make  reasoning 
about  examples,  even  simple  ones,  very  difficult. 

By restricting attention to an asynchronous language, Amadio [4] has recently
improved on the results of [2], providing simpler translations. Although our work
developed independently of [4], the language we study has much in common with the
language developed there. The main difference is that our language has no value-passing,
allowing us to concentrate on the effects of location failure and simplifying the
statement of many of our results. Since the issues raised by failures and value passing are
largely independent, this paper may be seen as providing two extensional views of a
language similar to Amadio's; the first of these is concrete, as is his translation, the
second is more abstract.

In Section 2, we consider a simple language for located processes based on pure
CCS [18], with which we assume familiarity. For example $(a.p)_\ell$ is a process located at
$\ell$ which, if $\ell$ is alive, may perform the action $a$ and then behave as $(p)_\ell$. In addition to the
usual operators of CCS we have the following new operators: spawn($\ell$, $p$), which starts
process $p$ running at location $\ell$; kill $\ell$.$p$, which, if location $\ell$ is alive, kills $\ell$ (with the result
that any process located at $\ell$ is deactivated) and then behaves as $p$; and if $\ell$ then $p$ else $q$,
which silently evolves to either $p$ or $q$, depending on whether $\ell$ is alive or dead when
the test is performed.

We give an operational semantics for this language in terms of a labelled transition
system. The judgments depend on a set $L$ of live locations, and are of the
form $L \rhd P \xrightarrow{\alpha} L' \rhd P'$, where $P$ and $P'$ are located processes and $\alpha$ is either a visible
action, which permits synchronization, or the internal action $\tau$. To decide on
an appropriate equivalence between process terms we follow the approach advocated
in [22]. We define both strong and weak barbed equivalence between processes,
$\dot\sim$ and $\dot\approx$. We then dictate that the required equivalence, which we refer to
as barbed bisimulation equivalence, is defined (for example in the weak case) as:
$P \approx Q$ if and only if for every suitable context $C[\,]$, $C[P] \mathrel{\dot\approx} C[Q]$. Although this may be
reasonable, it is not a very useful definition; the reader is invited to determine whether
the following pairs of processes should be equivalent or distinguished.


P\ 

Q\ 


(a  +  x)^|(a.fl)J\a 


P2  =  [(if  k  then  a  else  nil)^  |  (oc.t?)  J  \a 
Q2  -  (spawn(/:,fl))^ 


In Section 3 we define two bisimulation-based relations, strong and weak
Located-Failure equivalence (LF-equivalence), and show that these coincide with the indirectly
defined barbed congruences. Since LF-equivalence is defined using bisimulations, the
problem of deciding that two systems are semantically congruent can, in principle, be
solved using standard proof techniques associated with bisimulation [18]. However,
constructing an LF-bisimulation requires that one consider the behavior of the systems
under all possible sequences of kills, by both the systems themselves and the environment.
The number of states that must be explored may be exponentially larger than the
number needed to construct a CCS bisimulation.

In Section 4 we use the ideas of [15] to give alternative symbolic characterizations
of LF-equivalence that can be decided using a much smaller state space. The idea is to
replace the operational judgments $L \rhd P \xrightarrow{\alpha} L' \rhd P'$ with judgments of the form
$P \xrightarrow[\varphi]{\alpha} P'$, where $\varphi$ is a logical formula that describes the circumstances under which
the action $\alpha$ can be performed. In the strong case the required logic is straightforward: a
propositional logic that describes the state (dead or alive) of the sites in the system. In the
weak case, however, we require a more complicated logic that can express statements
of the form "site $\ell$ was alive at some point in the past". Using these symbolic transitions,
the standard definition of symbolic bisimulation [15] requires only minor modification
to capture LF-equivalence, and hence the symbolic proof techniques and tools of [15] may be used
to check the new semantic equivalences proposed in this paper.

In  this  extended  abstract  we  have  omitted  several  formal  definitions  and  all  proofs. 
The  full  version  [21]  includes  additional  results  and  examples,  including  a  discussion 
of  basic  processes  and  comparisons  with  other  equivalences. 


2  The  Language 

The syntax of processes is parameterized with respect to several syntactic sets. We
assume a set $Loc$ of locations $k, \ell, m$, a set $PConst$ of process constants $A$ used to define
recursive processes, and a set $Act$ of communication actions $a, b, c$, such that every
action $a \in Act$ has a complement $\bar{a} \in Act$ ($\bar{\ }$ is a bijection on $Act$). The set $Act_\tau = Act \cup \{\tau\}$
of actions $\alpha$ includes also the distinguished silent action $\tau$. The formal syntax is as
follows. Most of the operators should be familiar from CCS; all of the new constructs
have been described in the introduction.

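$p, q\ (\in BProc) ::= \alpha.p \mid \mathrm{spawn}(\ell, p) \mid \mathrm{kill}\,\ell.p \mid \mathrm{if}\ \ell\ \mathrm{then}\ p\ \mathrm{else}\ q \mid A \mid \sum_{i \in I} p_i \mid p|q \mid p\backslash a \mid p[f]$

$P, Q\ (\in LProc) ::= P|Q \mid P\backslash a \mid P[f] \mid (p)_\ell$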

We  have  adopted  a  two-level  syntax  which  distinguishes  between  basic  processes  p 
and  located  processes  P.  Intuitively,  a  basic  process  corresponds  to  what  one  normally 
thinks  of  as  a  process:  a  collection  of  threads  of  computation  that  must  be  run  at  a  single 
site. A located process, instead, corresponds to a distribution of basic processes over
several  sites.  Note  that  many  basic  processes  may  be  located  at  a  single  site,  and  a  basic 
process  may  share  a  private  channel  (unknown  to  other  basic  processes  running  at  the 
same  site)  with  a  remote  process. 

The ability of a process to perform an action is dependent on the set of live locations,
and consequently the transition relation determining the operational semantics
is defined between configurations. A liveset $L$ is any subset of $Loc$. A configuration
$(L \rhd P)$ is a pair comprising a liveset $L$ and a located process term $P$. The set of all
configurations is $Config$, ranged over by $C$ and $D$.

In giving the intensional semantics of processes, it will be convenient for later
development if we distinguish executions of the operator kill $\ell$.$p$ depending upon whether
$\ell$ is alive or dead at the time of execution. To capture this distinction, we extend the set
of actions to the set $KAct = Act \cup \{kill_\ell \mid \ell \in Loc\}$, which includes the kill actions $kill_\ell$.
Unless otherwise specified, $\mu$ ranges over $KAct_\tau = KAct \cup \{\tau\}$. In Table 1 we define the
transition relation $(\xrightarrow{\mu}) \subseteq Config \times Config$. The definition uses the following simple
structural equivalence on processes:

$$(p|q)_\ell \equiv (p)_\ell \,|\, (q)_\ell \qquad (p\backslash a)_\ell \equiv (p)_\ell \backslash a \qquad (p[f])_\ell \equiv (p)_\ell[f]$$

While the transition relation $\xrightarrow{\mu}$ distinguishes effective kill actions from those that
have no effect, a basic tenet of our study is that the precise moment of location failure
should be unobservable. Thus we extract from $\xrightarrow{\mu}$ a transition relation $\overset{\alpha}{\longmapsto}$ in which
all kill actions have been replaced with silent actions. It is this derived relation $\longmapsto$ that
we take to be fundamental.

Definition 1 ($\longmapsto$). $C \overset{a}{\longmapsto} C'$ iff $C \xrightarrow{a} C'$

$C \overset{\tau}{\longmapsto} C'$ iff $C \xrightarrow{\tau} C'$ or $\exists k: C \xrightarrow{kill_k} C'$ □

Most of the rules in Table 1 are straightforward, being inherited directly from CCS,
modulo the constraint that the process $(p)_\ell$ can only move if $\ell$ is alive. Note that the
three new operators (kill, spawn and the conditional) are modeled as $\tau$-transitions;
this reflects the fact that in a distributed system the implementation of these operators
would involve some computation and thus the passage of some time.
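
The rules for the new operators can be read as a small interpreter. The following is only a sketch for a fragment of the language (prefix, spawn, kill, conditional, parallel composition with communication; sums, restriction, renaming and constants are left out). The class names, the `"~a"` spelling of complements, the label encoding, and the reading that a spawned process lands at the target location are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Nil:
    pass

@dataclass(frozen=True)
class Prefix:          # a.p
    act: str
    cont: object

@dataclass(frozen=True)
class Spawn:           # spawn(k, p)
    target: str
    cont: object

@dataclass(frozen=True)
class Kill:            # kill m.p
    target: str
    cont: object

@dataclass(frozen=True)
class If:              # if m then p else q
    cond: str
    then: object
    orelse: object

@dataclass(frozen=True)
class At:              # (p)_l, a basic process placed at location l
    proc: object
    loc: str

@dataclass(frozen=True)
class Par:             # P | Q
    left: object
    right: object

def co(a):
    """Complement of an action name; "~a" is the co-name of "a"."""
    return a[1:] if a.startswith("~") else "~" + a

def steps(L, P):
    """Yield (label, L', P') for the transitions of configuration L |> P."""
    if isinstance(P, At):
        p, loc = P.proc, P.loc
        if loc not in L:
            return                                  # dead locations cannot move
        if isinstance(p, Prefix):
            yield ("act", p.act), L, At(p.cont, loc)
        elif isinstance(p, Spawn):
            yield ("tau",), L, At(p.cont, p.target)
        elif isinstance(p, Kill):
            if p.target in L:                       # effective kill
                yield ("kill", p.target), L - {p.target}, At(p.cont, loc)
            else:                                   # kill with no effect
                yield ("tau",), L, At(p.cont, loc)
        elif isinstance(p, If):
            yield ("tau",), L, At(p.then if p.cond in L else p.orelse, loc)
    elif isinstance(P, Par):
        left, right = list(steps(L, P.left)), list(steps(L, P.right))
        for mu, L1, P1 in left:
            yield mu, L1, Par(P1, P.right)
        for mu, L1, Q1 in right:
            yield mu, L1, Par(P.left, Q1)
        for mu, _, P1 in left:                      # communication
            for nu, _, Q1 in right:
                if mu[0] == "act" and nu == ("act", co(mu[1])):
                    yield ("tau",), L, Par(P1, Q1)

def derived(L, P):
    """The derived relation: kill labels are replaced by silent steps."""
    for mu, L1, P1 in steps(L, P):
        yield (("tau",) if mu[0] == "kill" else mu), L1, P1

L0 = frozenset({"l", "k"})
P0 = Par(At(Kill("k", Prefix("a", Nil())), "l"), At(Prefix("b", Nil()), "k"))
```

In this example the process at `l` can kill `k`; afterwards the `b` prefix located at `k` is deactivated and only the `a` action remains. Under `derived`, the kill step shows up as a silent step.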

We now discuss the problem of defining an appropriate semantic equivalence for located
processes, based on the transition relation $\longmapsto$. An obvious possibility is to adapt
the  bisimulation  equivalences  of  CCS  [18].  (Strong)  CCS-bisimulation  is  the  largest 
symmetric  relation  on  configurations  such  that  whenever  C  D  and  C  C‘ 
there  exists  a  D'  such  that/)  D'  and  C'  £)'.  A  weak  version  of  this  relation, 

can  be  obtained  by  adapting  this  definition  to  the  weak  transition  relation  defined  as 
usual.  To  see  that  CCS-bisimulation  is  not  suitable  for  our  language,  for  example  is  not  a 
congruence,  consider  the  processes  P3  [(a.a)^  |  (a)^]  \aand  03  =  [(a)^  |  (a.<3)^] \a. 
P3  distinguished  by  a  context  that  kills  location  £,  if  this 

kill  action  is  performed  after  the  initial  communication  on  a. 

The  use  of  for  CCS  has  been  justified  in  [22]  by  the  fact  that  it  coincides  with  the 
congruence  obtained  from  a  simple  notion  of  observation  called  barbed  bisimulation. 
Similar  results  have  been  obtained  for  lazy  and  eager  functional  languages  [1,  14,  7], 
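
Table 1 Transition system with configurations (symmetric rules for | omitted)

Act$_c$)    $L \rhd (\alpha.p)_\ell \xrightarrow{\alpha} L \rhd (p)_\ell$    if $\ell \in L$
Spawn$_c$)  $L \rhd (\mathrm{spawn}(k,p))_\ell \xrightarrow{\tau} L \rhd (p)_k$    if $\ell \in L$
Kill1$_c$)  $L \rhd (\mathrm{kill}\,m.p)_\ell \xrightarrow{kill_m} L \setminus \{m\} \rhd (p)_\ell$    if $\ell \in L$, $m \in L$
Kill2$_c$)  $L \rhd (\mathrm{kill}\,m.p)_\ell \xrightarrow{\tau} L \rhd (p)_\ell$    if $\ell \in L$, $m \notin L$
Cond1$_c$)  $L \rhd (\mathrm{if}\ m\ \mathrm{then}\ p\ \mathrm{else}\ q)_\ell \xrightarrow{\tau} L \rhd (p)_\ell$    if $\ell \in L$, $m \in L$
Cond2$_c$)  $L \rhd (\mathrm{if}\ m\ \mathrm{then}\ p\ \mathrm{else}\ q)_\ell \xrightarrow{\tau} L \rhd (q)_\ell$    if $\ell \in L$, $m \notin L$
Sum$_c$)    $L \rhd (\sum_{i \in I} p_i)_\ell \xrightarrow{\mu} L' \rhd (p_j')_\ell$    if $L \rhd (p_j)_\ell \xrightarrow{\mu} L' \rhd (p_j')_\ell$, $j \in I$
Def$_c$)    $L \rhd (A)_\ell \xrightarrow{\mu} L' \rhd (p')_\ell$    if $L \rhd (p)_\ell \xrightarrow{\mu} L' \rhd (p')_\ell$, $A \stackrel{\mathrm{def}}{=} p$
Str$_c$)    $L \rhd P \xrightarrow{\mu} L' \rhd Q$    if $P \equiv P'$, $L \rhd P' \xrightarrow{\mu} L' \rhd Q'$, $Q' \equiv Q$
Par$_c$)    $L \rhd P|Q \xrightarrow{\mu} L' \rhd P'|Q$    if $L \rhd P \xrightarrow{\mu} L' \rhd P'$
Comm$_c$)   $L \rhd P|Q \xrightarrow{\tau} L' \rhd P'|Q'$    if $L \rhd P \xrightarrow{a} L' \rhd P'$, $L \rhd Q \xrightarrow{\bar{a}} L' \rhd Q'$
Restr$_c$)  $L \rhd P\backslash a \xrightarrow{\mu} L' \rhd P'\backslash a$    if $L \rhd P \xrightarrow{\mu} L' \rhd P'$, $\mu \notin \{a, \bar{a}\}$
Ren$_c$)    $L \rhd P[f] \xrightarrow{f(\mu)} L' \rhd P'[f]$    if $L \rhd P \xrightarrow{\mu} L' \rhd P'$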

giving further evidence for the reasonableness of this approach. Roughly, two processes
are barbed bisimilar if every silent transition of one can be matched by a silent transition
of the other in such a way that the derived states are capable of exactly the same
observable actions; in addition, the derived states must also be barbed bisimilar. For
our language, the formal definition is as follows.

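Definition 2 (Barbed bisimulation). Weak barbed bisimilarity ($\dot\approx$) is the largest symmetric
relation over configurations such that whenever $C \mathrel{\dot\approx} D$: (a) $C \overset{\tau}{\Longmapsto} C'$ implies
that for some $D'$, $D \overset{\tau}{\Longmapsto} D'$ and $C' \mathrel{\dot\approx} D'$; and (b) for every $a$, $C \overset{a}{\Longmapsto}$ implies $D \overset{a}{\Longmapsto}$.

Strong barbed bisimilarity ($\dot\sim$) is obtained by replacing $\Longmapsto$ by $\longmapsto$ everywhere in the
definition. □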

Barbed  bisimulation  is  a  very  weak  relation;  for  example,  it  is  not  preserved  by 
parallel  composition.  However,  by  closing  over  all  contexts  we  arrive  at  a  reasonable 
semantic  equivalence  that  by  definition  enjoys  an  important  property,  namely  that  it  is 
a  congruence. 

Definition 3 (Barbed equivalence). Located processes $P$ and $Q$ are (weak) barbed
equivalent ($P \approx Q$) if for every context $C[\,]$ such that $C[P]$ and $C[Q]$ are configurations,
$C[P] \mathrel{\dot\approx} C[Q]$. Strong barbed equivalence ($\sim$) is obtained in the same manner
from $\dot\sim$. □

Because it requires quantification over all contexts, barbed equivalence is difficult
to use directly. For example the processes $P_1$ and $Q_1$, given in the introduction, are
distinguished by $\approx$, whereas $P_2$ and $Q_2$ are identified; it is far from obvious why. Even
worse, processes $P_5$ and $Q_5$ (given in Section 3) are related, although establishing this
fact requires that one prove that $P_1$ and $Q_1$ are related under the assumption that $\ell$ is
alive at the time $P_1$ and $Q_1$ are compared, that is, $\ell$ is initially alive.

We  end  this  section  with  some  additional,  simpler  examples.  The  processes  {a)^  \ 
(h)(^  and  (Z?)^  |  can  be  distinguished  by  a  context  that  kills  i.  The  same  context  can 
be  used  to  distinguish  the  basic  processes  spawn(^,a)  and  spawn(/:,^7),  regardless  of 
where  they  are  located.  These  examples  indicate  that  although  the  location  of  an  action 
is  not  reflected  directly  in  the  operational  semantics  they  do  impinge  on  the  behavior  of 
processes.  The  order  in  which  kill  actions  are  executed  is  also  significant.  For  example 
kili^.kill/:  can  be  distinguished  from  kill/:.kill^  using  the  process  | 


3  Located-Failures  Equivalence 


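In this section and the next, we provide alternative characterizations of barbed
equivalence for our language. Note that if $L \rhd P \xrightarrow{\mu} L' \rhd P'$, then $L'$ is determined by $L$ and
$\mu$. To emphasize this, we adopt the following notation. For each action $\mu$, we define
a function $\mathrm{iafter}_\mu$ which reflects the immediate effect of action $\mu$ on a liveset. We
also define the relations $\xrightarrow{\mu}_L$ and $\overset{\mu}{\longmapsto}_L$ on process terms, which capture the capability of
action $\mu$ under liveset $L$.

$$P \xrightarrow{\mu}_L P' \iff L \rhd P \xrightarrow{\mu} \mathrm{iafter}_\mu(L) \rhd P'$$
$$P \overset{\mu}{\longmapsto}_L P' \iff L \rhd P \overset{\mu}{\longmapsto} \mathrm{iafter}_\mu(L) \rhd P'$$

$$\mathrm{iafter}_\mu(L) = \begin{cases} L \setminus \{k\} & \text{if } \mu = kill_k \\ L & \text{if } \mu \in Act \cup \{\tau, \varepsilon\} \end{cases}$$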


For  example,  iaftera(P)  =  L  for  any  a,  and  iafter^///^({^,/:})  =  {/:}.  If  P  =  (oc.^)^  |  (ot)^, 
then  P  nil,  but  P  has  no  a- transition  under  the  liveset  {Z:}. 
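
The function iafter is simple enough to state as code. This sketch assumes labels are encoded as tuples, `("kill", k)` for a kill action and anything else for an action that leaves the liveset unchanged; the encoding and the name `iafter` as a Python function are illustrative only.

```python
def iafter(mu, L):
    """Immediate effect of action mu on liveset L.

    A kill action removes its target from the liveset;
    every other action leaves the liveset as it is.
    """
    if mu[0] == "kill":
        return frozenset(L) - {mu[1]}
    return frozenset(L)

both_alive = frozenset({"l", "k"})
```

For instance, an ordinary action applied to `both_alive` returns the same liveset, while killing `l` leaves only `k` alive. Killing a location that is already dead has no effect on the liveset.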

We  first  present  the  strong  case. 


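Definition 4 (Strong LF-equivalence). Let $\mathcal{S} = \{\mathcal{S}_L\}_{L \subseteq Loc}$ be an indexed family of
relations on $LProc$. $\mathcal{S}$ is a strong LF-bisimulation if for every $L$, $\mathcal{S}_L$ is symmetric and
whenever $P \,\mathcal{S}_L\, Q$:

(a) $P \xrightarrow{\mu}_L P'$ implies $\exists Q': Q \xrightarrow{\mu}_L Q'$ and $P' \,\mathcal{S}_{\mathrm{iafter}_\mu(L)}\, Q'$

(b) for every $k \in L$: $P \,\mathcal{S}_{L \setminus \{k\}}\, Q$

$P$ and $Q$ are strong LF-equivalent under $L$ ($P \simeq_L Q$) if there exists a strong
LF-bisimulation $\mathcal{S}$ with $P \,\mathcal{S}_L\, Q$.

$P$ and $Q$ are strong LF-equivalent ($P \simeq Q$) if $P \simeq_L Q$ for every subset $L$ of $Loc$. □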

In the full paper, we prove that $\simeq$ and $\sim$ coincide. The alternative characterization
of  weak  barbed  equivalence  is  more  complicated:  it  is  not  sufficient  to  change  the  strong 
arrows  in  Definition  4  to  weak  arrows.  To  see  this,  consider  the  following  processes: 


P5=[Ap.a+6.  (a+x))^|(p.(a+t.a)  +  a.a)J  \«\P 

Qs=  [A(a+T))J 

If $\ell$ is initially dead, $P_5$ and $Q_5$ are clearly equivalent: both are strongly equivalent to nil.
If  ^  is  initially  alive,  however,  the  situation  is  not  so  clear.  The  questionable  move  is 
^-transition  to  Pi  ~  [  (a)^  [  (a  -f  ]  \a.  To  match  this  move  Q5  must  perform  a  weak 
6-transition  to  ~  [  (a  +  x)J  (a.fl)^]  \a.  But  Pi  and  Qi  are  not  barbed  equivalent:  if 
$\ell$ is dead, then $Q_1$ is capable of an $a$-transition that $P_1$ cannot match. This would lead one
to believe that $P_5$ and $Q_5$ are not barbed equivalent; however, they are.

Intuitively this is true because when $P_5$ reaches $P_1$, $\ell$ must be alive; thus $P_1$ and $Q_1$
need only be compared under the constraint that $\ell$ is initially alive. Once this comparison
has begun, the environment can distinguish $Q_1$ from $P_1$ only by killing $\ell$, but it
cannot control internal activity on the part of $P_1$ before $\ell$ is dead.

Definition  5  (Weak  LF-equivalence).  For  define  ju  such  that  a-aandx  =  E. 

The  definition  of  ~  is  similar  to  that  for  except  that  when  P  Q,  we  require: 

(a)  P  P'  implies  3Q' :  Q  ^  Q‘  and  P'  Q! 

(b)  for  every  A:  GL  :  0  ^  G' and  P  O'  O 

Whereas the first clause in the definition of weak LF-bisimulation is as one would
expect, the second clause is somewhat surprising. It says, in effect, that if the environment
kills a location $k$, then $Q$ must be able to (silently) evolve to a process $Q'$ that
matches $P$; but in reaching $Q'$, $Q$ may exploit the intermediate states of the system (that
is, $k$ alive, then $k$ dead).

Theorem  6.  For  all  located  processes  P^Q  if  and  only  ifP  ^  Q.  □ 

4  Symbolic  characterizations 

While the LF-equivalences provide a great deal of insight into the meaning of barbed
equivalence in distributed process description languages such as ours, they are unwieldy
to use in practice. For the most part, this is due to the use of configurations in the
operational semantics. In this section, we improve this situation by defining a symbolic
transition system directly on located process terms, then giving characterizations of strong
and weak LF-equivalence using these symbolic transitions. As one should expect, the
weak case is quite a bit more subtle than the strong.

We begin by giving the symbolic operational semantics. The symbolic transition
relation makes use of propositional formulae $\pi$, $\rho$, which are given a semantics in terms
of livesets. Intuitively, a formula indicates a set of constraints on the status of locations
(dead  or  alive)  at  the  time  that  the  transition  is  enabled.  If  P  P"  then  if  location  0 
is  dead  and  1  is  alive,  P  is  capable  of  making  an  //-transition  to  P';  that  is,  if  0  ^  L  and 
1  G  L  then  P  P’ .  In  Table  2  we  define  the  transition  relation  C  LProc  x  LProc. 
The two transition systems are related by the fact that $P \xrightarrow{\mu}_L P'$ if and only if there exists
a $\pi$ such that $P \xrightarrow[\pi]{\mu} P'$ and $L \models \pi$.

The standard definition of symbolic bisimulation [15] requires that we define entailment
between formulae, which we do in the standard way:

$$\pi \Vdash \rho \quad\text{iff}\quad \forall L: L \models \pi \text{ implies } L \models \rho$$
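
For a finite set of locations, entailment can be decided by enumerating all livesets. The sketch below assumes a hypothetical tuple encoding of propositional formulae (`("alive", k)` for the atom stating that `k` is alive, plus `"not"`, `"and"`, `"or"`, `"tt"`, `"ff"`); it is a brute-force illustration of the definition, not the decision procedure used by symbolic tools.

```python
from itertools import combinations

def holds(L, f):
    """Satisfaction L |= f for a liveset L and a formula f."""
    tag = f[0]
    if tag == "tt":
        return True
    if tag == "ff":
        return False
    if tag == "alive":
        return f[1] in L
    if tag == "not":
        return not holds(L, f[1])
    if tag == "and":
        return holds(L, f[1]) and holds(L, f[2])
    if tag == "or":
        return holds(L, f[1]) or holds(L, f[2])
    raise ValueError(f"unknown connective: {tag}")

def livesets(locs):
    """All subsets of the finite location set locs."""
    locs = sorted(locs)
    for r in range(len(locs) + 1):
        for chosen in combinations(locs, r):
            yield frozenset(chosen)

def entails(pi, rho, locs):
    """pi entails rho iff every liveset satisfying pi also satisfies rho."""
    return all(holds(L, rho) for L in livesets(locs) if holds(L, pi))

LOCS = {"l", "k"}
l_alive, k_alive = ("alive", "l"), ("alive", "k")
```

The enumeration visits all 2^n livesets over n locations, which is fine for small examples and mirrors the quantification over all L in the definition.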

Table 2 Symbolic transition system (symmetric rules for | omitted)

Acts) 

{a.p),  (p)i 

Spawns) 

(spawn(/:,p))^-|^(p)^ 

Killl,) 

Kill2s) 

Condls) 

(if  m  then /p  else  ?)<  77^;+ (p)/ 

Cond2s) 

(if  m  then  p  else 

Sums) 

{^iaPi)t  ^P'j)t 

if 

ipj)i 

Defs) 

(A),-^(p')t 

if 

(p)t (p')t .  =% 

Strs) 

P^Q 

if 

P  =  P‘,P‘  -i^Q',Q!  =  Q 

Pars) 

PlQ-h^lQ 

if 

P^r 

Conrirris) 

PlQ^riQ’ 

if 

P-^P',Q-^Q' 

Restrs) 

P\a  P\a 

if 

P-^  P' 

Rens) 

P[f]^P’[f] 

if 

P^P' 

Note that entailment is a preorder on formulae. If $\pi \Vdash \rho$ we say that $\pi$ is
stronger than $\rho$. ff is the strongest formula under $\Vdash$, tt the weakest.

We must also identify a set of formulae suitable as parameters in the recursive
definition of symbolic equivalence, that is, the analogs of the parameters $L$ in the
definition of LF-equivalence. Intuitively, when we say that $P$ and $Q$ are LF-equivalent
under $L$, we are limiting attention to a single possible world, namely that in which
exactly the sites in $L$ are alive. The idea of symbolic equivalences, instead, is to treat
many possible worlds simultaneously (via entailment). In the case of strong
LF-bisimulation, where $P \sim_L Q$ and $M \subseteq L$ imply $P \sim_M Q$, this is achieved by
restricting attention to negative formulae (formulae which contain no positive atoms)
in the recursive definition of symbolic equivalence. Finally, we identify a
transformation on formulae (indexed by actions) which specifies the conditions under
which residual processes are to be compared:

$M \models \mathrm{after}_{a}(\rho)$ iff $\exists L$: $L \models \rho$ and $M \subseteq L$

$M \models \mathrm{after}_{kill\,k}(\rho)$ iff $\exists L$: $L \models \rho$ and $M \subseteq L \setminus \{k\}$
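
Entailment and the "after" transformations quantify over livesets, so for a small, fixed set of locations they can be checked by brute-force enumeration. The following Python sketch illustrates this; the location names `k` and `l`, the encoding of formulae as predicates on livesets, and all function names are assumptions made for illustration and are not notation from the paper.

```python
from itertools import combinations

LOCS = ("k", "l")  # hypothetical finite universe of locations


def livesets(locs=LOCS):
    """Enumerate every liveset (set of live locations) over locs."""
    return [frozenset(c) for r in range(len(locs) + 1)
            for c in combinations(locs, r)]


# A formula is encoded as a predicate on livesets (an assumption of this sketch).
def atom(k):
    return lambda L: k in L


def neg(p):
    return lambda L: not p(L)


def conj(p, q):
    return lambda L: p(L) and q(L)


tt = lambda L: True
ff = lambda L: False


def entails(pi, rho, locs=LOCS):
    """pi entails rho iff every liveset satisfying pi also satisfies rho."""
    return all(rho(L) for L in livesets(locs) if pi(L))


def after_act(rho, locs=LOCS):
    """M satisfies after_a(rho) iff some L satisfying rho has M as a subset."""
    return lambda M: any(rho(L) and M <= L for L in livesets(locs))


def after_kill(k, rho, locs=LOCS):
    """M satisfies after_kill_k(rho) iff some L satisfying rho has M inside L minus k."""
    return lambda M: any(rho(L) and M <= L - {k} for L in livesets(locs))
```

In this encoding, `after_act(neg(atom("k")))` is equivalent to `neg(atom("k"))`, whereas `after_act(atom("k"))` is equivalent to `tt`: the positive atom is lost because the location may die in the meantime. This is one way to see why negative formulae are the natural parameters in the strong case.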

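Definition 7 (Strong symbolic bisimulation). Let $S$ be a family of relations on LProc
indexed by negative formulae $\vartheta$. $S$ is a strong symbolic bisimulation if for every
$\vartheta$, $S^{\vartheta}$ is symmetric and whenever $P \mathrel{S^{\vartheta}} Q$ and
$P \xrightarrow[\pi]{\mu} P'$ then for some $\pi_i$, $\rho_i$, and $Q_i$:

(a) $\vartheta \wedge \pi \Vdash \bigvee_i \rho_i$,
(b) $\rho_i \Vdash \pi_i$,
(c) $Q \xrightarrow[\pi_i]{\mu} Q_i$, and
(d) $P' \mathrel{S^{\mathrm{after}_{\mu}(\rho_i)}} Q_i$.

We write $P \sim^{\vartheta} Q$ to indicate that there exists a symbolic bisimulation $S$ with
$P \mathrel{S^{\vartheta}} Q$. □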

Theorem 8. $P \sim^{\vartheta} Q$ iff $P \sim_L Q$ for all $L \models \vartheta$. In addition, $(\sim) = (\sim^{tt})$.
As a first attempt to define weak symbolic bisimulation, let us try simply replacing
the  strong  transitions  in  Definition  7  with  weak  edges  defined  by  conjoining  formulae. 
For  example,  we  would  have  P  P  and  P  P'  if  P  ^ ■  Unfortunately, 

this definition does not suffice. Consider the processes $P_5$ and $Q_5$, previously defined;
these have the following symbolic transition graphs.

[Figure: symbolic transition graphs of $P_5$ and $Q_5$]

As  noted  in  Section  3,  in  order  to  prove  these  processes  equivalent  we  must  compare 
the  processes  Pi  and  Ci  under  the  assumption  that  I  is  initially  alive,  but  using  our 
provisional  definition  we  would  end  up  comparing  Pi  and  Cl  under  the  assumption 
tt  neg(^  A  k),  which  is  not  strong  enough  to  prove  that  they  are  related. 

As a second attempt, we might simply allow all positive information to carry over
into the recursive formula $\vartheta_i$, that is, change the last clause of Definition 7 to
$\vartheta_i = \rho_i$. Whereas our first attempt produced an equivalence that was too strong,
the revised definition is too weak. For example, the following processes would be
identified even though they are not barbed equivalent.


Pb=  [(«.«)( I  (oc)*]\a 
Pb 


26=  [(«)<|(a.a)*]\0( 
Q'b 


Here $P_6'$ and $Q_6'$ would be compared under the formula $\ell \wedge k$. This formula,
however, says something more than we would like, namely that $\ell$ and $k$ remain alive
until $P_6'$ and $Q_6'$ execute their first action. More complicated examples can be
constructed to show that we must be able to express properties such as “$\ell$ and $k$
must have been alive, then $\ell$ must have died, and after that $k$ must have died.”

Our  solution  is  to  define  weak  symbolic  edges  using  a  past-time  temporal  logic  [\1], 
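interpreted over sequences of livesets. A live sequence $\mathcal{L}$ is a finite nonempty
sequence of livesets $\langle L_1, \ldots, L_n \rangle$ such that for every $i$ between 1 and
$n-1$ there exists a location $k$ such that $L_{i+1} = L_i \setminus \{k\}$. For example,
$\langle \{\ell\}, \{k\} \rangle$ and $\langle \{\ell, k\}, \emptyset \rangle$ are not live sequences.
We write $\mathcal{L}(i)$ for the $i$-th element of $\mathcal{L}$ and, where clear from context,
use $n$ to refer to the length of $\mathcal{L}$. Thus, for example, $\mathcal{L}$ models $\ell$ if
$\ell \in \mathcal{L}(n)$, and $\mathcal{L}$ models $\Diamond\varphi$ if $\mathcal{L}$ or some prefix of
$\mathcal{L}$ models $\varphi$. Because live sequences must be strictly decreasing,
$\ell \wedge \Diamond\neg\ell$ is unsatisfiable; however
$\langle \{\ell\}, \emptyset \rangle \models \neg\ell \wedge \Diamond\ell$. Weak symbolic
transitions are defined as follows: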


^((pATC) 

P  - ^ 

(pATI 


P"  ifP 


P'  ifP=^ 


P 

P 


_ QL 

"cpTof 


killk 

7A©((pA;i) 


>P'  ifP 
^P"  ifP 


PC  '  p/ 
Tt  ^ 
killk  pi 
Tt  ^  ^ 
Intuitively $P \stackrel{\mu}{\Longrightarrow}_{\varphi} P'$ means that $P$ can perform the action $\mu$ to become $P'$ in an
environment where the change in live sets satisfies the formula $\varphi$.  For  example  if  (pi  =  A  ^
and  (p2  =  A  A:) ;  then  Pe  has  the  symbolic  transition  ^  but  not  whereas  for  (26 
it  is  the  opposite. 
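
The live-sequence semantics described above (atoms are read at the last liveset of the sequence, and a past-time formula holds if it holds on the sequence or on one of its prefixes) can be sketched as a small evaluator. This is only an illustrative sketch: the tuple encoding of formulae, the tag `once` for the past-time operator, and the location names `l` and `k` are assumptions, not the paper's notation.

```python
def is_live_sequence(seq):
    """A live sequence is non-empty and each step removes exactly one live location."""
    if not seq:
        return False
    for cur, nxt in zip(seq, seq[1:]):
        cur, nxt = set(cur), set(nxt)
        if not (nxt < cur and len(cur - nxt) == 1):
            return False
    return True


def models(seq, phi):
    """Evaluate a formula on a live sequence.

    phi is a nested tuple (hypothetical encoding):
    ("atom", k), ("not", p), ("and", p, q), ("once", p).
    """
    tag = phi[0]
    if tag == "atom":
        # atoms refer to the final liveset of the sequence
        return phi[1] in seq[-1]
    if tag == "not":
        return not models(seq, phi[1])
    if tag == "and":
        return models(seq, phi[1]) and models(seq, phi[2])
    if tag == "once":
        # holds if the argument holds on the sequence or on some non-empty prefix
        return any(models(seq[:i], phi[1]) for i in range(1, len(seq) + 1))
    raise ValueError(f"unknown formula tag: {tag}")
```

For instance, on the two-element sequence `[{"l"}, set()]` the formula "`l` is dead now and was once alive" holds, while "`l` is alive now and was once dead" cannot hold on any live sequence because livesets only shrink.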

As parameters to the weak relation we simply take Boolean formulae, but now
interpreted on the initial liveset of a live sequence. Rather than use two logics in the
definition or introduce additional operators, we define the function “initially” which
converts Boolean formulae into temporal formulae with this interpretation in mind. The
transformation function for generating formulae after an action is performed, which we
call “finally”, must then transform temporal formulae into propositional ones. The
definitions are as follows (in the full paper, we show how to calculate these functions):

$\mathcal{L} \models \mathrm{initially}(\pi)$ iff $\mathcal{L}(1) \models \pi$

$M \models \mathrm{finally}(\varphi)$ iff $\exists \mathcal{L}$: $\mathcal{L} \models \varphi$ and $M = \mathcal{L}(n)$

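Definition 9 (Weak symbolic bisimulation). Similar to Definition 7, except that when
$P \mathrel{S^{\pi}} Q$ and $P \stackrel{\mu}{\Longrightarrow}_{\varphi} P'$ we require:

(a) $\mathrm{initially}(\pi) \wedge \varphi \Vdash \bigvee_i \psi_i$,
(b) $\psi_i \Vdash \varphi_i$,
(c) $Q \stackrel{\mu}{\Longrightarrow}_{\varphi_i} Q_i$, and
(d) $P' \mathrel{S^{\mathrm{finally}(\psi_i)}} Q_i$.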

Theorem  10.  P^lQ  iff  P  Q.  In  addition,  (~)  =  (~tt)- 

5  Conclusions 

In  this  paper  we  have  proposed  a  new  semantic  theory  for  distributed  systems  which 
takes  into  account  the  possibility  of  failures  at  sites.  This  theory  is  an  adaptation  of 
standard  bisimulation-based  theories  [18]  using  an  operational  semantics  for  located 
processes.  The  new  semantic  equivalences  are  justified  in  terms  of  barbed  bisimulations 
[22].  We  also  give  symbolic  characterizations  of  the  new  equivalences,  which  means 
that  they  can  be  investigated  using  the  symbolic  methods  of  [15]. 

Site failure has also played a role in languages studied in [2, 4, 12]. In these papers
abstract languages based on Facile [13] or the pi-calculus [19, 5] are studied. The
original motivation for this paper was to provide an alternative characterization of
barbed equivalence for languages such as these. Although we have not treated value
passing or references, we postulate that our results can be extended in a straightforward
way to value-passing languages which retain the assumption that all failures are
independent, such as the languages in [2, 4]. More delicate is the extension to languages
such as the distributed join-calculus [12] in which the independence assumption is
dropped. In this case the logical language used for symbolic bisimulations must be
extended to allow statements about the interdependence of locations; we leave this to
future work.

A number of location-based equivalences already exist in the literature [8, 9, 20, 10];
however, none of these theories addresses the possible failure of sites. Their emphasis,
rather, is to define a measure of the concurrency or distribution of a process: two
processes are deemed equivalent only if, informally, they have the same degree of
concurrency. In the full paper we give a series of counter-examples which show that ~
is incomparable with all of the equivalences proposed in these papers; we also discuss
variations on the language and model of failure.
Abstract. We propose a general approach to define behavioural preorders
over process terms by considering the pre-congruences induced
by three basic observables. These observables provide information about
the initial communication capabilities of processes and about their
possibility of engaging in an infinite internal chattering. We show that some of
the observables-based pre-congruences do correspond to behavioural
preorders long studied in the literature. The coincidence proofs shed light on
the differences between the must preorder of De Nicola and Hennessy and
the fair/should preorder of Cleaveland and Natarajan and of Brinksma,
Rensink and Vogler, and on the role played in their definition by tests
for internal chattering.


1  Introduction 

In the classical theory of functional programming, the point of view is assumed
that executing a program corresponds to evaluating it. If we write $M \Downarrow v$ to
indicate that program $M$ evaluates to value $v$, the problem of the equivalence of
two programs, hence of their semantics, can be stated as follows:

Two programs $M$ and $N$ are observationally equivalent if for every
program context $C$ such that both $C[M]$ and $C[N]$ are programs, and for
every value $v$, we have: $C[M] \Downarrow v$ if and only if $C[N] \Downarrow v$.

An  alternative  approach,  used  e.g.  for  the  lazy  lambda  calculus  [1],  is  that  of 
defining  a  simulation  (whose  kernel  is  an  equivalence)  based  on  the  reduction  to 
normal  forms.  In  general,  given  a  language  equipped  with  a  reduction  relation, 
the  paradigm  for  defining  equivalence  over  terms  of  the  language,  can  be  traced 
back  to  Morris  [16]  and  can  be  phrased  as  follows: 

1.  Define  a  set  of  observables  (values,  normal  forms,  . . .)  to  which  a  program 
can  evaluate  by  means  of  successive  reductions. 

2. Consider the largest (pre-)congruence over the (set of operators of the)
language induced by the chosen set of observables.

This  paradigm  has  been  the  basis  for  assessing  many  semantics  of  sequential 
languages  and  is  at  the  heart  of  the  full  abstraction  problem,  see  e.g.  [18]. 

Here, we aim at taking advantage of this paradigm also to assess models of
concurrent systems and their equivalences. In this case, the choice of the basic
observables is less obvious. On one hand, it is well-known that input/output
relations are not sufficient for describing the semantics of these classes of systems,
and thus it would be limiting to use values as observables. On the other hand,
studying the evolution to normal forms under all possible contexts is not as
inspective as in the case of lambda calculus. Indeed, the interaction between a
$\lambda$-term and the environment is circumscribed, while that between a process and
its environment is less clear.

* Work partially supported by EEC: HCM project EXPRESS, by CNR project
“Specifica ad alto livello e verifica formale di sistemi digitali” and by Istituto di
Elaborazione dell'Informazione CNR, Pisa.

If we consider the $\lambda$-term $MN$, we know the extent of the influence of $N$ over
$M$, and, in any computation, we know exactly when an interaction between $M$
and $N$ occurs, namely when $M$ reduces to a $\lambda$-abstraction. Thus by observing $M$
in all possible contexts we can fully understand its behaviour. When considering
concurrent systems, the internal evolution of each parallel component is freely
intermingled with external communications. Then understanding the semantics
of a component via its contextual behaviour turns out to be much less obvious.

Here, we shall consider a simple process description language, TCCS (Tau-less
CCS [7]), and will study the impact of three basic observables for concurrent
systems on this language. However, our results are easily extensible to general
SOS language formats, like GSOS [2].

We  shall  be  interested  in  testing  for  the  initial  guaranteed  communication 
capabilities  of  a  system.  Indeed,  when  one  is  willing  to  infer  the  interactive 
behaviour  of  a  system  from  its  “isolated”  behaviour,  to  know  about  the  system’s 
possibility  of  accepting  communications  along  specific  channels  is  not  sufficient: 
due  to  the  inherent  nondeterminism  of  concurrent  computations,  it  is  necessary 
to  know  whether  the  acceptance  of  the  communications  is  guaranteed.  This  is 
essential  to  establish  liveness  properties,  like  the  absence  of  deadlock. 

Moreover,  we  shall  be  interested  in  the  risk  a  system  has  of  getting  involved  in 
an  infinite  sequence  of  internal  communications  (to  diverge),  because  this  could 
lead  to  ignoring  all  subsequent  external  stimuli.  Finally,  with  respect  to  this,  it 
might  also  be  important  to  know  the  external  communications  that  can  lead  to 
divergent  states. 

These  considerations  guide  us  to  introducing  three  basic  observables: 

1. $P\,!\,\ell$ ($P$ guarantees $\ell$) asserts that, by internal actions, $P$ can only reach
states from which action $\ell$ can be eventually performed;

2. $P \downarrow$ ($P$ converges) asserts that $P$ cannot get involved in an infinite sequence
of internal actions;

3. $P \downarrow \ell$ ($P$ converges along $\ell$) asserts that $P$ converges and does so also after
performing $\ell$.

For  finite  process  graphs  these  observables  are  obviously  decidable;  in  general, 
they  are  not,  but  this  is  somehow  expected  since  the  basic  language  (TCCS)  is 
Turing  powerful. 

We  shall  analyze  the  impact  of  the  above  predicates  on  the  semantics  of 
TCCS.  The  predicates  naturally  induce  five  contextual  preorders.  These  pre¬ 
orders  are  listed  in  Table  1;  there  we  represent  a  contextual  preorder  using  the 
notation  ’  where  si  (if  present)  refers  to  the  used  convergence  predicate, 

and  S2  (if  present)  refers  to  the  guarantees  one.  The  universal  relation  is  denoted 
by  U. 
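Table 1. Contextual preorders, indexed by the convergence predicate used (no req.,
$\downarrow$, $\downarrow\ell$) and by the communication predicate used (no req., $!$).

Table 2. Main results

conv. / comm.     | no req.                                | !
no req.           | U                                      | closure of the fair/should preorder
$\downarrow$      | convergent traces (reverse inclusion)  | must preorder
$\downarrow\ell$  | convergent traces (reverse inclusion)  | safe-must preorder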


Our main results are five full abstraction theorems that make it manifest
that our contextual preorders do coincide with well-known and/or intuitive
behavioural preorders over processes studied in the literature. Table 2 provides a
summary of the claimed results.

More specifically, we will show that:

- the contextual preorder induced by $!$ coincides with the maximal
pre-congruence included in the fair/should preorder of [17] and [3].
This pre-congruence can be characterized (see [4]) as the conjunction of
the classical trace preorder (called may preorder in [6]) with the fair/should
preorder;

- the contextual preorders induced by $\downarrow$ and by $\downarrow\ell$ both coincide
with the (reverse) inclusion of the convergent traces preorder, a simple
variant of the trace preorder.

Together with the impact of the three observables used in isolation we also
study the result of their conjunctions and show that:

- the contextual preorder induced by $\downarrow$ and $!$ coincides with the
original must preorder of [6, 10];

- the contextual preorder induced by $\downarrow\ell$ and $!$ gives rise to a new
preorder, the safe-must preorder, which is supported by a very intuitive
testing scenario.

The safe-must preorder has a direct characterization in terms of
computations from pairs of observers and processes: a computation is successful if a
success state is reached before a catastrophic one (this explains the adjective
‘safe’). This notion certainly deserves further investigation.

In the rest of the paper, we recall syntax and operational semantics (Sec. 2)
and introduce an observational semantics (Sec. 3) for TCCS, then we present
our full abstraction results (Sec. 4), compare the semantic preorders (Sec. 5)
and briefly discuss related work. Due to space limitations, most proofs have
been omitted; they can be found at http://dsi2.dsi.unifi.it/~denicola.

2  Tau-less  CCS:  TCCS 

In this section, we briefly present the syntax and the operational semantics of
TCCS ($\tau$-less CCS [7, 10]). We have preferred to use TCCS rather than CCS
because it allows us to avoid the “congruence problems” that arise when the CCS
choice operator (+) is used and silent actions are abstracted away. It is worth
mentioning that the very same results can be obtained by using CCS and its
must pre-congruence obtained from the must preorder by imposing that whenever
the “better” process can perform a silent move also the other can do it.

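We assume an infinite set of names $\mathcal{N}$, ranged over by $a, b, \ldots$, and let
$\overline{\mathcal{N}} = \{\bar{a} \mid a \in \mathcal{N}\}$, ranged over by $\bar{a}, \bar{b}, \ldots$, be the set of
co-names. $\mathcal{N}$ and $\overline{\mathcal{N}}$ are disjoint and are in bijection via the
complementation function $\overline{(\cdot)}$; we define $\overline{\bar{a}} = a$. We let
$\mathcal{L} = \mathcal{N} \cup \overline{\mathcal{N}}$, ranged over by $\ell, \ell', \ldots$, be the set of labels; we shall
use $B$ to range over subsets of $\mathcal{L}$ and we define $\overline{B} = \{\bar{\ell} \mid \ell \in B\}$. We also
assume a countable set $\mathcal{X}$ of process variables, ranged over by $X, Y, \ldots$.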

Definition 1. The set of TCCS terms is generated by the grammar:

$E ::= 0 \mid \Omega \mid \ell.E \mid E\,[]\,F \mid E \oplus F \mid E \,|\, F \mid E\backslash L \mid E\{f\} \mid X \mid recX.E$

where $f : \mathcal{L} \to \mathcal{L}$, called relabelling function, is such that
$\{\ell \mid f(\ell) \neq \ell\}$ is finite, $f(a) \in \mathcal{N}$ and $f(\bar{\ell}) = \overline{f(\ell)}$. We let
$\mathcal{P}$, ranged over by $P$, $Q$, etc., denote the set of closed terms or processes (i.e.
those terms where every occurrence of any agent variable $X$ lies within the scope of
some $recX.\_$ operator).
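
The closedness condition in Definition 1 amounts to a free-variable computation over the term structure. The sketch below illustrates it in Python; the tagged-tuple encoding and all tag names are hypothetical, and the tag `omega` stands for the second constant of the grammar, read here as a divergent constant (an assumption of this sketch).

```python
def free_vars(t):
    """Return the set of process variables occurring free in term t.

    Terms are tagged tuples (hypothetical encoding):
      ("nil",), ("omega",), ("prefix", label, E), ("ext", E, F), ("int", E, F),
      ("par", E, F), ("restr", E, labels), ("relab", E, f), ("var", X), ("rec", X, E)
    """
    tag = t[0]
    if tag in ("nil", "omega"):
        return set()
    if tag == "var":
        return {t[1]}
    if tag == "prefix":
        return free_vars(t[2])
    if tag in ("ext", "int", "par"):
        return free_vars(t[1]) | free_vars(t[2])
    if tag in ("restr", "relab"):
        return free_vars(t[1])
    if tag == "rec":
        # recX.E binds X inside E
        return free_vars(t[2]) - {t[1]}
    raise ValueError(f"unknown term tag: {tag}")


def is_process(t):
    """A process is a closed term: no variable occurs free."""
    return not free_vars(t)
```

For example, `("rec", "X", ("prefix", "a", ("var", "X")))` is a process, while its body `("prefix", "a", ("var", "X"))` on its own is not.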

In the following, we often shall write $\ell$ instead of $\ell.0$. We write
$\_\{\ell'_1/\ell_1, \ldots, \ell'_n/\ell_n\}$ for the relabelling operator $\_\{f\}$ where
$f(\ell) = \ell'_i$ if $\ell = \ell_i$, $i \in \{1, \ldots, n\}$, and $f(\ell) = \ell$ otherwise. As usual,
we write $E[E_1/X_1, \ldots, E_n/X_n]$ for the term obtained by simultaneously substituting
each occurrence of $X_i$ in $E$ with $E_i$ (with renaming of bound process variables
possibly involved).

The structural operational semantics of a TCCS term is defined via the two
transition relations $\xrightarrow{\ell}$ and $\rightarrowtail$ induced by the inference rules in Table 3
and in Table 4, respectively. The symmetrical versions of rules AR4 and AR5 in Table 3
and of rules IR5, IR6 and IR7 in Table 4 have been omitted.


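IR1   $\Omega \rightarrowtail \Omega$
IR2   $recX.E \rightarrowtail E[recX.E/X]$
IR3   $P \rightarrowtail P'$ implies $P\{f\} \rightarrowtail P'\{f\}$
IR4   $P \rightarrowtail P'$ implies $P\backslash L \rightarrowtail P'\backslash L$
IR5   $P \oplus Q \rightarrowtail P$
IR6   $P \rightarrowtail P'$ implies $P\,[]\,Q \rightarrowtail P'\,[]\,Q$
IR7   $P \rightarrowtail P'$ implies $P \,|\, Q \rightarrowtail P' \,|\, Q$
IR8   $P \xrightarrow{\ell} P'$ and $Q \xrightarrow{\bar{\ell}} Q'$ imply $P \,|\, Q \rightarrowtail P' \,|\, Q'$

Table 4. SOS rules for TCCS: Internal Relation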
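As usual, we use $\stackrel{\varepsilon}{\Longrightarrow}$ or $\Longrightarrow$ to denote the reflexive and
transitive closure of $\rightarrowtail$ and use $\stackrel{s}{\Longrightarrow}$, with $s \in \mathcal{L}^{+}$, for
$\Longrightarrow \xrightarrow{\ell} \stackrel{s'}{\Longrightarrow}$ when $s = \ell s'$. Moreover,
we write $P \stackrel{s}{\Longrightarrow}$ for $\exists P'$: $P \stackrel{s}{\Longrightarrow} P'$
($P \xrightarrow{\ell}$ and $P \rightarrowtail$ will be used similarly).
We will call sort of $P$ the set
$sort(P) = \{\ell \in \mathcal{L} \mid \exists s \in \mathcal{L}^{*} : P \stackrel{s\ell}{\Longrightarrow}\}$, successors
of $P$ the set $S(P) = \{\ell \in \mathcal{L} \mid P \stackrel{\ell}{\Longrightarrow}\}$, and language generated by
$P$ the set $L(P) = \{s \in \mathcal{L}^{*} \mid P \stackrel{s}{\Longrightarrow}\}$. Note that since we only
consider finite relabelling operators, every TCCS process has a finite sort.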

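A context is a TCCS term $C$ with one free occurrence of a process variable,
usually denoted by $\_$. If $C$ is a context, we write $C[P]$ instead of $C[P/\_]$. The
context closure $\mathcal{R}^{c}$ of a given binary relation $\mathcal{R}$ over processes is defined as:
$P \mathrel{\mathcal{R}^{c}} Q$ iff for each context $C$, $C[P] \mathrel{\mathcal{R}} C[Q]$. $\mathcal{R}^{c}$ enjoys
two important properties: (a) $(\mathcal{R}^{c})^{c} = \mathcal{R}^{c}$ and (b) $\mathcal{R} \subseteq \mathcal{R}'$ implies
$\mathcal{R}^{c} \subseteq \mathcal{R}'^{c}$. In the following, we will write $\not\mathrel{\mathcal{R}}$ for the
complement of $\mathcal{R}$.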

3  Observational  Semantics 

In  this  section,  we  introduce  different  observational  semantics  for  TCCS;  we 
follow  two  approaches.  The  first  approach  takes  advantage  of  basic  observables, 
the  second  one  of  the  classical  testing  scenario  of  [6,  10]  and  variants  of  it. 

3.1  Basic  Observables  and  Observation  Preorders 

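Definition 2. Let $P$ be a process and $\ell \in \mathcal{L}$. We define three basic observation
predicates over processes as follows:

- $P\,!\,\ell$ ($P$ guarantees $\ell$) iff $\forall P'$: $P \stackrel{\varepsilon}{\Longrightarrow} P'$ implies $P' \stackrel{\ell}{\Longrightarrow}$;

- $P \downarrow$ ($P$ converges) iff there is no infinite sequence of internal transitions
$P \rightarrowtail P_1 \rightarrowtail \cdots$ starting from $P$;

- $P \downarrow \ell$ ($P$ converges along $\ell$) iff $P \downarrow$ and $\forall P'$: $P \stackrel{\ell}{\Longrightarrow} P'$ implies $P' \downarrow$.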
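
On a finite process graph the three predicates can be computed by simple graph searches. The following Python sketch is one possible reading, under stated assumptions: the graph is given as two dictionaries (`tau` for internal moves and `act` for labelled moves, both hypothetical names), and a weak labelled move is taken to be internal moves, then one labelled move, then internal moves.

```python
def closure(p, tau):
    """All states reachable from p by zero or more internal moves."""
    seen, stack = {p}, [p]
    while stack:
        q = stack.pop()
        for r in tau.get(q, ()):
            if r not in seen:
                seen.add(r)
                stack.append(r)
    return seen


def weak(p, label, tau, act):
    """States reachable from p by internal moves, one `label` move, internal moves."""
    out = set()
    for q in closure(p, tau):
        for (l, r) in act.get(q, ()):
            if l == label:
                out |= closure(r, tau)
    return out


def converges(p, tau):
    """True iff no internal cycle is reachable from p (valid for finite graphs only)."""
    for q in closure(p, tau):
        later = set()
        for r in tau.get(q, ()):
            later |= closure(r, tau)
        if q in later:
            return False
    return True


def guarantees(p, label, tau, act):
    """Every internal derivative of p can still weakly perform `label`."""
    return all(weak(q, label, tau, act) for q in closure(p, tau))


def converges_along(p, label, tau, act):
    """p converges, and so does every weak `label`-derivative of p."""
    return converges(p, tau) and all(
        converges(r, tau) for r in weak(p, label, tau, act))


# Hypothetical example graph: P chooses internally between a and b,
# Q offers a, R offers a and then diverges in state D.
TAU = {"P": {"A", "B"}, "D": {"D"}}
ACT = {"A": {("a", "0")}, "B": {("b", "0")},
       "Q": {("a", "0")}, "R": {("a", "D")}}
```

In this example `P` does not guarantee `a` because its internal derivative `B` cannot perform it, and `R` converges but does not converge along `a`.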

The  above  predicates  can  be  combined  in  five  sensible  ways  and  used  to 
define  the  corresponding  basic  observation  preorders  over  processes,  as  stated  in 
the  following  definition. 

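Definition 3. Let $P$ and $Q$ be processes.

- $P \preceq^{\downarrow} Q$ iff $P \downarrow$ implies $Q \downarrow$;

- $P \preceq^{\downarrow\ell} Q$ iff for each $\ell \in \mathcal{L}$: $P \downarrow \ell$ implies $Q \downarrow \ell$;

- $P \preceq_{!} Q$ iff for each $\ell \in \mathcal{L}$: $P\,!\,\ell$ implies $Q\,!\,\ell$;

- $P \preceq^{\downarrow}_{!} Q$ iff for each $\ell \in \mathcal{L}$: $P \downarrow$ and $P\,!\,\ell$ implies $Q \downarrow$ and $Q\,!\,\ell$;

- $P \preceq^{\downarrow\ell}_{!} Q$ iff for each $\ell \in \mathcal{L}$: $P \downarrow \ell$ and $P\,!\,\ell$ implies $Q \downarrow \ell$ and $Q\,!\,\ell$.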

Of course, the basic observation preorders are very coarse. More refined
relations can be obtained by closing the above preorders under all TCCS contexts.
For each basic observation preorder, say $\preceq$, the contextual preorder generated by
$\preceq$ is defined as its closure $\preceq^{c}$.
3.2  Testing Preorders and Alternative Characterizations

Like in the original theory of testing [6, 10], we have that:

- observers, ranged over by $O$, $O'$, ..., are processes capable of performing an
additional distinct “success” action $w \notin \mathcal{L}$;

- computations from $P \mid O$ are sequences of internal transitions
$P \mid O \rightarrowtail P_1 \mid O_1 \rightarrowtail \cdots$ which are either infinite or such that
$P_k \mid O_k \not\rightarrowtail$, $k \geq 0$.

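Definition 4. Let $P$ be a process and $O$ be an observer.

1. $P \mathrel{must_M} O$ if for each computation from $P \mid O$, say
$P \mid O \rightarrowtail P_1 \mid O_1 \rightarrowtail \cdots$, there is some $i \geq 0$ s.t. $O_i \xrightarrow{w}$.

2. $P \mathrel{must_S} O$ if for each computation from $P \mid O$, say
$P \mid O \rightarrowtail P_1 \mid O_1 \rightarrowtail \cdots$, there is some $i \geq 0$ s.t. $O_i \xrightarrow{w}$ and $P_i \downarrow$.

3. $P \mathrel{must_F} O$ if for each computation from $P \mid O$, say
$P \mid O \rightarrowtail P_1 \mid O_1 \rightarrowtail \cdots$, it holds that $P_i \mid O_i \stackrel{w}{\Longrightarrow}$ for each $i \geq 0$.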

The first definition of successful computation given above is exactly that
of [6]. The second one considers successful only those computations in which a
success state is reached before the observed process diverges. The third definition,
which is essentially taken from [3], totally ignores the issue of divergence. These
three notions allow us to define three preorders: the first one ($\sqsubseteq_M$) is the original
must preorder of [6, 10], the second one ($\sqsubseteq_S$) is the new safe-must preorder and
the third one ($\sqsubseteq_F$) is the (reverse of the) fair/should preorder of [17] and [3].

Definition 5. Let $i \in \{M, S, F\}$. For all processes $P$ and $Q$, $P \sqsubseteq_i Q$ iff for every
observer $O$: $P \mathrel{must_i} O$ implies $Q \mathrel{must_i} O$.

We  introduce  below  alternative  characterizations  of  the  preorders  must  and 
safe-must.  They  support  simpler  methods  for  proving  (or  disproving)  that  two 
processes  are  behaviourally  related.  We  need  some  additional  notation. 

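Definition 6. Let $s \in \mathcal{L}^{*}$, $B \subseteq_{fin} \mathcal{L}$ and $S$ be a set of processes.

- The convergence predicate, $\downarrow s$, is defined inductively as follows: $P \downarrow \varepsilon$ if $P \downarrow$;
$P \downarrow \ell s'$ if $P \downarrow \varepsilon$ and $\forall P'$: $P \stackrel{\ell}{\Longrightarrow} P'$ implies $P' \downarrow s'$.
We write $P \uparrow s$ if $P \downarrow s$ does not hold.

- ($P$ after $s$) denotes the set of processes $\{P' : P \stackrel{s}{\Longrightarrow} P'\}$.

- We write $P \downarrow B$ if $\forall \ell \in B$: $P \downarrow \ell$, and $S \downarrow B$ if $\forall P \in S$: $P \downarrow B$.

- $P\,!\,B$ stands for $\forall P'$: $P \stackrel{\varepsilon}{\Longrightarrow} P'$ implies $\exists \ell \in B$: $P' \stackrel{\ell}{\Longrightarrow}$.

- $S \downarrow!\, B$ stands for $\forall P \in S$: $P \downarrow B$ and $P\,!\,B$.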
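
For finite process graphs, the string-indexed notions of Definition 6 can be sketched by direct recursion on the string. The Python below is illustrative only: the dictionaries `tau` and `act`, the function names, and the reading of a weak labelled move as internal moves around a single labelled move are assumptions of the sketch; `must_accept` implements the set-level reading of the guarantee predicate used with ($P$ after $s$).

```python
def closure(p, tau):
    """States reachable from p by zero or more internal moves."""
    seen, todo = {p}, [p]
    while todo:
        for r in tau.get(todo.pop(), ()):
            if r not in seen:
                seen.add(r)
                todo.append(r)
    return seen


def weak(p, label, tau, act):
    """Weak `label`-derivatives of p."""
    return {r2 for q in closure(p, tau)
            for (l, r) in act.get(q, ()) if l == label
            for r2 in closure(r, tau)}


def converges(p, tau):
    """No internal cycle reachable from p (finite graphs only)."""
    return all(q not in {s for r in tau.get(q, ()) for s in closure(r, tau)}
               for q in closure(p, tau))


def converges_along(p, s, tau, act):
    """p converges along the string s, by induction on s."""
    if not converges(p, tau):
        return False
    if not s:
        return True
    return all(converges_along(q, s[1:], tau, act)
               for q in weak(p, s[0], tau, act))


def after(p, s, tau, act):
    """The set (p after s) of weak s-derivatives of p."""
    cur = closure(p, tau)
    for label in s:
        cur = {r for q in cur for r in weak(q, label, tau, act)}
    return cur


def must_accept(states, B, tau, act):
    """Every internal derivative of every state can weakly perform some label in B."""
    return all(any(weak(q, l, tau, act) for l in B)
               for p in states for q in closure(p, tau))


# Hypothetical example graph.
TAU = {"P": {"A", "B"}, "D": {"D"}}
ACT = {"A": {("a", "A1")}, "A1": {("c", "0")},
       "B": {("b", "0")}, "R": {("a", "D")}}
```

With this graph, `must_accept({"P"}, {"a", "b"}, TAU, ACT)` holds while `must_accept({"P"}, {"a"}, TAU, ACT)` does not, which mirrors the way acceptance sets distinguish internal choices.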

Definition 7. For all processes $P$ and $Q$, we write

- $P \ll_M Q$ if $\forall s \in \mathcal{L}^{*}$ such that $P \downarrow s$, it holds that:

(a) $Q \downarrow s$, and (b) for every $B \subseteq_{fin} \mathcal{L}$: ($P$ after $s$) $!\,B$ implies ($Q$ after $s$) $!\,B$.

- $P \ll_S Q$ is the same as above but predicate $!$ is replaced by $\downarrow!$.

Theorem 8. For all processes $P$ and $Q$, (1) $P \sqsubseteq_M Q$ iff $P \ll_M Q$ and (2)
$P \sqsubseteq_S Q$ iff $P \ll_S Q$.

By taking advantage of the above alternative characterizations it is easy to
prove that the must and the safe-must preorders are pre-congruences.

Theorem 9. For all processes $P$ and $Q$ and $i \in \{M, S\}$, $P \sqsubseteq_i Q$ iff $P \sqsubseteq_i^{c} Q$.

Note that the congruence result does not hold for the fair/should preorder
$\sqsubseteq_F$: it is not preserved by the recursion operator. This can be easily seen by
considering the following counter-example. Consider the processes $P = a.b\,[]\,a.c$
and $Q = a.b$ and the context $C = recX.(\_ \,|\, a.b.X)\backslash\{a, b\}$. It obviously holds that
$P \sqsubseteq_F Q$, but $C[P] \not\sqsubseteq_F C[Q]$ (just take $O = c.w$); hence $P \not\sqsubseteq_F^{c} Q$.

An alternative characterization of the closure of the fair/should preorder is
given in [4], for a language slightly different from ours.

Definition 10. For all processes $P$ and $Q$, we write

$P \sqsubseteq_{FT} Q$ if ($P \sqsubseteq_F Q$ and $L(P) \subseteq L(Q)$).

Theorem 11. For all processes $P$ and $Q$, $P \sqsubseteq_{FT} Q$ iff $P \sqsubseteq_F^{c} Q$.

4  Full  Abstraction  Results 

From now on, we adopt the following convention: an action declared fresh in a
statement is supposed to be different from any other name and co-name
mentioned in the statement.

4.1  Convergence  predicate  and  convergent  traces 

In  this  section,  we  deal  with  the  first  two  contextual  preorders,  and  •> 
and  prove  that  they  have  the  same  distinguishing  power  and  coincide  with  the 
reverse  inclusion  of  the  convergent  traces  preorder. 

Definition 12. For all processes $P$ and $Q$, we write $P \ll_{CT} Q$ if $\forall s \in \mathcal{L}^{*}$ such
that $P \downarrow s$, it holds that:

a) $Q \downarrow s$, and

b) $s \in L(Q)$ implies $s \in L(P)$.

Theorem  13.  For  all  processes  P  and  Q,  P  g^  Q  iff  P  E^  Q- 

The  following  special  contexts  can  be  used  to  prove  the  next  theorems.  If 
s  G  say  s  =  •  •  •  4  {n  >  0),  we  define 

-  Cf  =  _  I  4-  •  •  •  .4-0  and 

-  Ci  = 

Theorem  14.  For  all  processes  P  and  Q,  P  g^  Q  iff  P  <3- 
Theorem  15.  For  all  processes  P  and  Q,  <3  iff  F  Q. 
4.2  Guarantees and fair testing

Lemma 16. Let $P$ be a process, $O$ be an observer and let $\ell \in \mathcal{L}$ be a fresh
action; (1) $P \mathrel{must_F} O$ iff $P \mid O\{\ell/w\}\;!\;\ell$, and (2) $P\,!\,\ell$ iff $P \mathrel{must_F} \ell.w$.

Theorem 17. For all processes $P$ and $Q$, $P \sqsubseteq_F^{c} Q$ iff $P \preceq_{!}^{c} Q$.

Proof: ($\Leftarrow$) We prove that $\preceq_{!}^{c}$ is contained in $\sqsubseteq_F$; the claimed result follows
by closing under contexts. Suppose that $P \preceq_{!}^{c} Q$ and that $P \mathrel{must_F} O$; let $\ell$ be a
fresh action. We have:

$P \mathrel{must_F} O$ implies (Lemma 16(1))

$P \mid O\{\ell/w\}\;!\;\ell$ implies (hypothesis $P \preceq_{!}^{c} Q$, with $C = \_ \mid O\{\ell/w\}$)

$Q \mid O\{\ell/w\}\;!\;\ell$ implies (Lemma 16(1))

$Q \mathrel{must_F} O$

($\Rightarrow$) The proof is similar but relies on Lemma 16(2). □

4.3  Guarantees  and  convergence,  and  must  testing 

The  next  definition  introduces  two  special  contexts  to  be  used  in  the  proof  of 
Theorem  20. 

Definition  18.  Let  s  €  P*,  say  s  —  In  {n  >  and  B  C.  Let 
denote  a  function  which  maps  each  ^  G  P  to  a  single  fresh  c.  Fix  a  bijective 
correspondence  among  ^i,  . . . ,  £n  and  n  fresh  actions  ai,  . . . ,  q:„.  We  define 

-  Ci  =  .\Ql  where  =  c  and  Qi^'  =  LQ^  []c,  and 

-  Cl'®  =  (-  I  I  Ql  where  =  h.ai- ■  ■ -In-o-n,  Ql  =  0  and  Qi'^'  = 

ai.Ql'ljc. 

Lemma  19.  Let  s  e  C*,  B  C  and  c  be  a  fresh  action. 

a)  P  4-  s  iff  C|[P]  ;  iff  C|[P]  4.  c. 

b)  (Pa/ters)!B  iffC|’®[P]!c. 

Theorem  20.  For  all  processes  P  and  Q,  P  Q  iff  P^r^^  Q- 

Proof:  (=^)  From  the  definition,  it  is  easily  seen  that  is  contained  in 
(indeed  P !  c  iff  {P  after  e) !  {c}).  From  this  fact,  by  closing  under  contexts 
and  applying  Theorem  8,  the  thesis  follows. 

(4=)  Here,  we  show  that  is  contained  in  .  From  this  fact  and  The¬ 
orem  8,  the  thesis  follows.  Assume  that  P  and  that  P  J.  s,  for  some  s  ^  C* . 

We  have  to  show  that:  (a)  Q  J,  s  and  (b)  {P  after  s)  \B  implies  (Q  after  s) !  B, 
for  any  B  C.  As  to  part  (a),  from  Pis  and  Lemma  19(a),  it  follows  that 
C§[P]  i.  Obviously,  for  every  process  P,  Cl[R]  !c.  From  G3[P]  4^,  ^'^[P]  !c  and 
P it  follows  that  C^[Q]  4-.  By  applying  again  Lemma  19(a),  but  in  the 
reverse  direction,  we  obtain  Q  4^  s.  As  to  part  (b),  suppose  that  {P  after  s) !  B. 
From  this,  applying  Lemma  19(b),  it  follows  that  C4^[P]!c.  Moreover,  it  is 
easy  to  see  that  for  every  process  R  i  s  implies  Cl'^[R]  4-  From  Cl’^[P]  4-, 
C4’^[P]  !c  and  P^:^^  Q,  it  follows  that  Cl’^[Q]  !c.  By  applying  again  Lemma 
19(b),  but  in  the  reverse  direction,  we  obtain  (Q  after  s)  \B.  □ 
4.4  Guarantees and convergence, and safe-must

To  prove  full  abstraction  for  safe-must,  we  will  use  another  special  context. 
Again,  we  assume  that  c  €  is  always  fresh.  If  s  G  £*,  say  s  =  ^  0)? 

and  B  Cfin  we  define  the  context 

-  =  _  I  where  =  EteB  and  ’^l]c. 

The  proof  of  the  following  theorem  is  similar  to  that  of  Theorem  20,  but  relies 
on  the  context  instead  of  . 

Theorem  21.  For  all  processes  P  and  Q,  P  0  iff  Q- 

It  is  worthwhile  to  point  out  why  the  context  cannot  be  used  in  place  of 
the  context  to  prove  full  abstraction  for  the  must  preorder  (Theorem  20). 
Indeed,  Pis  does  not  imply  that  C^'^[P]  i  (for  instance  a.b.O  i  a  but 
t).  This  would  invalidate  the  proof  of  the  “if”  part  of  Theorem  20. 

5  Comparing  the  preorders 

Theorem  22.  For  all  processes  P  and  Q,  P  Q  implies  P  E^  Qi  but  not 
vice-versa. 

Proof:  Paralleling  the  proof  of  Theorem  20,  part  it  is  easy  to  show  that 
is  contained  in  <5  ,  from  which  the  result  will  follow  by  applying  Theorems 

^  ^  def 

20  and  8.  To  show  that  the  vice-versa  does  not  hold,  consider  P  =  a.b.Q  and 
Q  ='^  a.  It  is  easy  to  see  that  P  Q,  but  P  0  (just  consider  _  |  S).  □ 


Theorem  23. 


1  == 

C  = 

l  —  c 

^  IC  —  C 

2.  = 

C 

and  E  is 

—  L 

Proof: 

'^FT 

c 


c  _  _^c 


1.  The  result  follows  from  Theorems  14,  20,  21  and  22.  By  definition,  it  is 

easily  seen  that  included  in  .  The  inclusion  is  strict: 

but  0. 

2.  The  equality  -  E  derives  from  Theorems  17  and  11.  To  see  that 
neither  of  E  ^  E  and  ,^Ms  included  in  E^  (hence  in  E„„  consider 

the  processes  P  recX.{a.X\]a.b)  and  Q  recX.a.X.  Clearly,  P  E^  Q-> 
hence  P  E^  Q  and  However,  P  E^  0  (because  Pmust^  O  and 

Q  O,  when  O  recX\a.X\^.w)).  To  see  the  converse,  observe  that 

0  12,  but  0  ,  2^  J7,  hence  0  E  ^  and  0^0.  ^ 

The  mutual  relationships  among  the  pre-congruences  are  simpler  if  we  move 
to  strongly  convergent  processes.  We  say  that  a  process  P  is  strongly  convergent 
if  P  I  5  for  every  s  6  P* . 

Theorem  24.  For  strongly  convergent  processes,  it  holds  that: 


C 


C  = 


l—C 


=: 


=  C 


C  = 
6  Conclusions

We  have  proposed  three  basic  notions  of  process  observables,  that,  when  closed 
with  respect  to  the  contexts  of  a  CCS-like  language,  induce  five  pre-congruences 
that  have  been  proved  to  coincide  with  well-known  and/or  intuitive  behavioural 
relations. 

Notions  of  observables  in  the  same  spirit  as  ours  have  been  proposed  in  [13], 
[21],  [11],  [15],  [8]  and  [12]. 

In [13], it is shown that the pre-congruence induced by inclusion of maximal
traces coincides, both for CCS and CSP, with the must pre-congruence of
[6]; another characterization is given by only considering the inclusion of the
maximal ε-trace, i.e. a sequence of invisible moves leading to a divergent state
or to a deadlocked one. The strength of the basic observables (maximal traces
are definitely more inspective than our guarantees predicate) prevents them from
capturing different notions such as fair testing, and obscures the role played by
the convergence test, which is somehow included in that for maximality.

In [21], two Petri nets are called d-equivalent if they both can reach a
deadlocked state or if they both cannot do so. Then it is proved that, by closing
d-equivalence with respect to parallel composition, the variant of failure
semantics [5] that ignores divergence is obtained.

In  [11],  a  series  of  variants  of  the  testing  framework  is  proposed  and  results 
are  listed  showing  that,  by  changing  the  expressive  power  of  testers,  a  number 
of  equivalences  ranging  from  bisimulation  to  testing  can  be  captured.  One  of 
the  considered  family  of  observers  is  that  consisting  just  of  agents  of  the  form 
^.u).^,  that  somehow  resemble  our  predicates.  It  is  claimed  that  for  strongly 
convergent  processes  the  pre-congruence  induced  by  this  family  of  observers 
coincides  with  the  must  preorder  and  the  reader  is  referred  to  [13]  for  the  proof. 
However,  we  could  not  find  the  proof  in  Main’s  paper. 

Milner and Sangiorgi [15] define an equivalence for processes based on
elementary observables, namely the possibility for a process to synchronize along a
specific channel. However, they allow the presence of this observable to be tested
recursively. The resulting notion of observability (called barbed bisimilarity),
when closed under parallel composition, yields bisimulation-based equivalences
that are significantly more discriminating than ours.

Ferreira  [8]  and  Laneve  [12]  deal  with  languages  significantly  different  from 
classical  process  algebras.  In  particular,  Ferreira  uses  a  predicate  which  resem¬ 
bles  very  much  the  conjunction  of  our  |  and  !  £  (based  on  production  of  values 
rather  than  on  communication  capabilities)  to  define  a  testing  preorder  for  Con¬ 
current  ML  [20];  this  seems  to  be  strongly  related  to  our  safe-must  preorder.  He 
also  conjectures  that  if  one  considers  pure  CCS  (and  observes  communication 
capabilities  instead  of  value  productions)  the  obtained  preorder  coincides  with 
the  must  pre-congruence  of  [6];  here  we  have  proved  this  conjecture.  Laneve  dis¬ 
cusses  the  impact  of  an  observables-based  testing  scenario  on  the  Join  Calculus, 
a  language  with  elaborate  synchronization  schemata  [9]. 
Abstract. Motivated by the problem of efficient routing in all-optical
networks, we study a constrained version of the bipartite edge coloring
problem. We show that if the edges adjacent to a pair of opposite vertices
of an L-regular bipartite graph are already colored with $\alpha L$ different
colors, then the rest of the edges can be colored using at most $(1 + \alpha/2)L$
colors. We also show that this bound is tight by constructing instances
in which $(1 + \alpha/2)L$ colors are indeed necessary. We also obtain tight
bounds on the number of colors that each pair of opposite vertices can
see.

Using  the  above  results,  we  obtain  a  polynomial  time  greedy  algorithm 
that  assigns  proper  wavelengths  to  a  set  of  requests  of  maximum  load 
L  per  directed  fiber  link  on  a  directed  fiber  tree  using  at  most  5/3L 
wavelengths.  This  improves  previous  results  of  [9,  7,  6,  10]. 

We also obtain that no greedy algorithm can in general use less than
5/3L wavelengths for a set of requests of load L in a directed fiber tree,
and thus that our algorithm is optimal in the class of greedy algorithms
which includes the algorithms presented in [9, 7, 6, 10].

1  Introduction 

In this paper, we study a constrained version of the well-known problem of
coloring the edges of an L-regular bipartite graph. It is a classical result from graph
theory (see e.g. [3]) that the edges of an L-regular bipartite graph can be colored
using exactly L colors so that edges that share an endpoint are assigned different
colors. We call such edge colorings legal colorings. The classical problem has no
additional constraints: any given color can be used on any edge provided that
no adjacent edge is colored using that same color. Our constrained version
of the bipartite edge coloring problem can be described in the following way.


Partially  supported  by  Progetto  MURST  40%,  Algoritmi,  Modelli  di  Calcolo  e  Strut- 
We are given an L-regular bipartite graph $G = (\{v_1, \ldots, v_n\}, \{u_1, \ldots, u_n\}, E)$
along with a partial legal coloring of its edges that specifies a color for all edges
incident to vertices $v_1$ and $u_1$. We denote the total number of constraining colors
by $\alpha L$, where $1 < \alpha < 2$. We want to color the remaining edges of the graph so
as to minimize the total number of colors used and the number of colors used to
color the edges touching a pair $(u_i, v_i)$ of opposite vertices.

Our  motivation  lies  in  the  field  of  WDM  (wavelength  division  multiplexing) 
routing  in  all-optical  networks.  Optics  is  emerging  as  a  key  technology  in  state- 
of-the-art  communication  networks.  A  single  optical  wavelength  supports  rates 
of  gigabits-per-second  (which  in  turn  support  multiple  channels  of  voice,  data, 
and  video  [5]  [8]).  Multiple  laser  beams  that  are  propagated  over  the  same  fiber 
on  distinct  optical  wavelengths  can  increase  this  capacity  much  further;  this  is 
achieved  through  WDM  (wavelength  division  multiplexing).  We  model  the  un¬ 
derlying  fiber  network  as  a  directed  graph.  Communication  requests  are  ordered 
transmitter-receiver  pairs  of  nodes.  WDM  technology  establishes  connectivity 
by  finding  transmitter-receiver  paths,  and  assigning  a  wavelength  to  each  path, 
so  that  no  two  paths  going  through  the  same  link  use  the  same  wavelength.  Op¬ 
tical  bandwidth  is  the  number  of  available  wavelengths.  Bandwidth  is  a  scarce 
resource:  state-of-the-art  technology  allows  for  no  more  than  30-40  optical  wave¬ 
lengths  in  the  laboratory,  less  than  half  as  many  in  manufacturing,  and  there  is 
no  anticipation  of  dramatic  progress  in  the  near  future  [11].  It  is  thus  important 
to  minimize  the  number  of  wavelengths  used  to  service  a  requested  communi¬ 
cation  pattern.  Variations  of  this  problem  have  been  studied  by  several  authors 
[12,  1,  9,  7,  6,  10,  2]. 

In this paper, we concentrate on tree topologies which are relevant to
wide-area networks. In particular we consider directed trees where each edge of the
tree consists of two opposite directed fiber links. Directedness accurately reflects
directed optical amplifiers placed on the fiber as well as asymmetries of the
communication requests. Raghavan and Upfal [9] showed that routing requests
of maximum load L per link of undirected trees can be satisfied using no more
than 3L/2 optical wavelengths and their arguments extend to give a 2L bound
for the directed case. Mihail et al. [7] were the first to address the directed case.
Their main result is a 15L/8 bound for directed trees. They obtain this bound by
reducing the wavelength assignment problem to the constrained bipartite edge
coloring problem and obtain a solution specifically for the case $\alpha = 3/2$. This was
improved in [6] (and independently in [10]) by solving optimally the constrained
bipartite edge coloring problem for the value $\alpha = 3/2$ and yielding a bound of
7/4L for directed trees.


1.1  Summary of results

Our  results  can  be  summarized  in  the  following  theorems.  We  first  present  our 
results  on  the  constrained  bipartite  edge  coloring  problem. 

Theorem 1. There exists a polynomial time algorithm that properly colors the
uncolored edges of an L-regular bipartite graph constrained by $\alpha L$ colors using
at most $(1 + \alpha/2)L$ colors and so that each pair $(v_i, u_i)$ of opposite vertices sees
no more than $\max\{\alpha L, (1 + \alpha/4)L\}$ different colors.
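
As a quick numerical illustration of the two bounds in Theorem 1, the following sketch evaluates them exactly for a given $\alpha$ and L. The function name `coloring_bounds` is hypothetical and not part of the paper; the code only evaluates the stated formulas and does not perform any coloring.

```python
from fractions import Fraction


def coloring_bounds(alpha, L):
    """Evaluate the two bounds stated in Theorem 1.

    Returns (total, per_line), where
      total    = (1 + alpha/2) * L            total colors used
      per_line = max(alpha*L, (1 + alpha/4)*L) colors seen by one pair
    Exact rational arithmetic is used to avoid rounding issues.
    """
    alpha = Fraction(alpha)
    total = (1 + alpha / 2) * L
    per_line = max(alpha * L, (1 + alpha / 4) * L)
    return total, per_line


if __name__ == "__main__":
    for a in (Fraction(1), Fraction(4, 3), Fraction(3, 2), Fraction(2)):
        total, per_line = coloring_bounds(a, 12)
        print(f"alpha={a}: total <= {total}, per line <= {per_line}")
```

Note that the two terms of the per-line bound coincide at $\alpha = 4/3$: below that value $(1 + \alpha/4)L$ dominates, above it $\alpha L$ does.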

The  next  lower  bound  states  that  the  above  result  is  in  general  tight. 

Theorem 2. For each $1 < \alpha < 2$ and for each $L > 0$ there exists an L-regular
bipartite graph constrained by $\alpha L$ colors for which any legal coloring of the
remaining edges requires at least $(1 + \alpha/2)L$ total colors while there exists a pair
of opposite vertices that sees at least $\max\{\alpha L, (1 + \alpha/4)L\}$ different colors.

Next we present our results for wavelength routing on directed trees. We
express our results in terms of the maximum load L of a set of requests; i.e.,
the maximum number of paths between transmitter and receiver that share the
same directed fiber link. The proposed algorithm is a greedy algorithm. A greedy
algorithm is an algorithm that considers the vertices of the tree one at a time in
a DFS manner and, while at vertex v, colors (i.e., assigns a wavelength to) all the
requests that touch vertex v (i.e., start at, end at, or go through v) that are still
uncolored. Once a request has been colored, a greedy algorithm never recolors it.
Greedy algorithms do not require global control and are thus amenable to being
implemented in a distributed setting without a “central authority” that has
knowledge of the overall request pattern. All known algorithms for the problem
of wavelength routing on directed trees are indeed greedy algorithms [7, 6, 10].
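
To make the notion of a greedy algorithm concrete, here is a minimal sketch that visits the tree in DFS order and gives each still-uncolored request touching the current vertex the smallest wavelength that is free on every directed link of its path. This first-fit rule is an assumption chosen for brevity: it fits the definition of "greedy" above but is not the coloring procedure of this paper and does not guarantee the 5/3L bound. All names (`tree_path`, `max_load`, `greedy_first_fit`) are hypothetical, and requests are assumed to have distinct endpoints.

```python
from collections import defaultdict


def tree_path(adj, src, dst):
    """Directed links (u, v) on the unique path src -> dst in a tree."""
    parent = {src: None}
    stack = [src]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                stack.append(w)
    links = []
    v = dst
    while parent[v] is not None:
        links.append((parent[v], v))
        v = parent[v]
    return links[::-1]


def max_load(adj, requests):
    """Maximum number of requests sharing one directed link (the load L)."""
    load = defaultdict(int)
    for s, t in requests:
        for link in tree_path(adj, s, t):
            load[link] += 1
    return max(load.values(), default=0)


def greedy_first_fit(adj, requests, root):
    """Colour requests vertex by vertex in DFS order; never recolour."""
    paths = [tree_path(adj, s, t) for s, t in requests]
    used = defaultdict(set)      # directed link -> wavelengths in use on it
    colour = {}                  # request index -> wavelength
    order, seen, stack = [], {root}, [root]
    while stack:                 # iterative DFS over the tree
        u = stack.pop()
        order.append(u)
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    for v in order:
        for i, path in enumerate(paths):
            if i in colour or not any(v in link for link in path):
                continue         # already coloured, or does not touch v
            c = 0
            while any(c in used[link] for link in path):
                c += 1           # smallest wavelength free on the whole path
            colour[i] = c
            for link in path:
                used[link].add(c)
    return colour
```

Because each decision uses only the wavelengths already fixed on the links of the request's path, the sketch needs no global view of the request pattern, which is the property the text highlights.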

Theorem 3. There exists a greedy polynomial time algorithm that assigns
wavelengths to a set of requests of maximum load L on a directed tree using at most
5/3L wavelengths.

Our  next  theorem  shows  a  lower  bound  that  implies  that  no  greedy  algorithm 
can  in  general  beat  the  5/3L  barrier. 

Theorem 4. For each L, for each $\epsilon > 0$ and for each greedy algorithm G there
exists a tree and a pattern of communication requests of maximum load L for
which G uses at least $(5/3 - \epsilon)L$ wavelengths.

Therefore better bounds can only be obtained by non-greedy algorithms. The
only known general lower bound is 5/4L [10].

The  rest  of  our  paper  is  organized  as  follows. 

In  Section  2,  we  prove  Theorem  1  by  giving  an  algorithm  that  solves  the 
constrained  bipartite  edge  coloring  problem.  Next,  in  Section  3  we  explain  the 
reduction  of  the  wavelength  routing  problem  on  directed  trees  to  the  constrained 
bipartite  edge  coloring  problem.  This  reduction  proves  Theorem  3.  Finally,  in 
Section  4,  we  present  our  lower  bounds. 

2  The  algorithm  for  the  constrained  bipartite  edge 
coloring  problem 

In  this  section  we  present  our  algorithm  for  solving  the  constrained  bipartite 
edge  coloring  problem. 
The algorithm receives as input an L-regular bipartite graph
$G = (\{W_0, \ldots, W_n\}, \{X_0, \ldots, X_n\}, E)$, where all the edges incident to $W_0$ and $X_0$ have been properly
colored using $\alpha L$ different colors. We call the edges that are colored color-forced
edges and a pair $(W_i, X_i)$ of opposite vertices a line. We assume without loss of
generality that no edge connects two opposite vertices. If a color appears on only
one color-forced edge, then we call it a single color. If it appears on two
color-forced edges, we call it a double color; note that one of these two color-forced
edges has to be incident to $W_0$ and the other to $X_0$. We denote by D and S the
number of double and single colors, respectively.
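
The definitions of single and double colors can be illustrated with a short sketch. It assumes, purely for illustration, that the input is given as the two lists of colors on the edges at $W_0$ and at $X_0$; the function name `count_double_single` is hypothetical.

```python
def count_double_single(colors_at_w0, colors_at_x0):
    """Return (D, S): the numbers of double and single colors.

    A color is double if it appears on a color-forced edge at W_0 and on
    one at X_0; it is single if it appears at exactly one of them.
    """
    w0, x0 = set(colors_at_w0), set(colors_at_x0)
    # A proper coloring never repeats a color around one vertex.
    if len(w0) != len(colors_at_w0) or len(x0) != len(colors_at_x0):
        raise ValueError("coloring at W_0 or X_0 is not proper")
    D = len(w0 & x0)   # colors present at both vertices
    S = len(w0 ^ x0)   # colors present at exactly one vertex
    return D, S


if __name__ == "__main__":
    D, S = count_double_single(["a", "b", "c"], ["a", "d", "e"])
    print(D, S)  # here L = 3, and 2*D + S equals 2*L
```

Since $W_0$ and $X_0$ each have L color-forced edges, the counts always satisfy 2D + S = 2L, and D + S is the number $\alpha L$ of constraining colors.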

Step 1: Obtaining perfect matchings. We proceed by decomposing the bipartite
graph into L perfect matchings, which can always be done since it is L-regular.
Each such matching includes exactly two color-forced edges: one incident to $W_0$
and one incident to $X_0$. A double color is called separated if its two color-forced
edges appear in different matchings. On the other hand, if they appear in the
same matching then the color is said to be preserved. We classify the matchings
into four types: TT, PP, SS, ST, based on their corresponding color-forced edges.
If the two color-forced edges of a matching are colored with separated colors,
then the matching is of type TT. If the two color-forced edges are colored with
the same preserved color, then the matching is of type PP. If the two color-forced
edges are colored with two single colors, then the matching is of type SS. If the
two color-forced edges are colored with a single color and with a separated color,
then the matching is of type ST.
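
The classification in Step 1 depends only on the colors of the two color-forced edges of each matching. The sketch below assumes each matching is summarized by a pair (color at $W_0$, color at $X_0$); this representation and the name `classify_matchings` are illustrative assumptions, not notation from the paper.

```python
def classify_matchings(matchings):
    """Classify each matching as 'TT', 'PP', 'SS' or 'ST'.

    matchings: list of (color_at_W0, color_at_X0) pairs, one pair per
    perfect matching of the decomposition.
    """
    w0_colors = {w for w, _ in matchings}
    x0_colors = {x for _, x in matchings}
    double = w0_colors & x0_colors      # colors seen at both W_0 and X_0
    types = []
    for w, x in matchings:
        if w == x:
            types.append("PP")          # preserved double color
        elif w in double and x in double:
            types.append("TT")          # two separated double colors
        elif w not in double and x not in double:
            types.append("SS")          # two single colors
        else:
            types.append("ST")          # one single, one separated
    return types


if __name__ == "__main__":
    ms = [("s0", "d1"), ("d1", "d2"), ("d2", "s1"), ("p", "p"), ("s2", "s3")]
    print(classify_matchings(ms))
```

A double color on a matching whose two forced colors differ is necessarily separated, because its second occurrence lies at the other special vertex in some other matching; this is why the test `w == x` suffices to detect preserved colors.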


Step 2: Constructing chains and cycles of matchings. We partition the matchings
into groups. Each such group is either a chain or a cycle of matchings. A chain
of matchings is a sequence $M_0, M_1, \ldots, M_{l-1}$ of $l$ matchings such that

1. $M_0$ and $M_{l-1}$ are matchings of type ST;

2. $M_1, \ldots, M_{l-2}$ are all matchings of type TT;

3. for each $0 \le i \le l - 2$, matchings $M_i$ and $M_{i+1}$ share exactly one double
(separated) color. A chain consists of at least two matchings.

A cycle of matchings is a sequence $(M_0, M_1, \ldots, M_{l-1})$ of $l$ TT matchings
such that, for each $0 \le i \le l - 1$, matchings $M_i$ and $M_{(i+1) \bmod l}$ share exactly
one double (separated) color.
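
The grouping of Step 2 can be sketched as follows, again assuming each matching is summarized by the pair (color at $W_0$, color at $X_0$). Following the double color from the $X_0$ side of one matching to the matching that carries the same color at $W_0$ defines a successor relation whose components are exactly paths and cycles. The function name and the decision to report SS matchings separately and PP matchings as cycles of length 1 are assumptions of this sketch.

```python
def chains_and_cycles(matchings):
    """Group matchings into chains, cycles and isolated SS matchings.

    matchings: list of (color_at_W0, color_at_X0) pairs.
    Returns (chains, cycles, ss) as lists of matching indices.
    """
    by_w0 = {w: i for i, (w, _) in enumerate(matchings)}
    succ = {}
    for i, (_, x) in enumerate(matchings):
        if x in by_w0:                  # x is a double color
            succ[i] = by_w0[x]
    has_pred = set(succ.values())

    chains, cycles, ss, seen = [], [], [], set()
    # Paths start at matchings whose W_0 color is single.
    for i in range(len(matchings)):
        if i in has_pred:
            continue
        path = [i]
        while path[-1] in succ:
            path.append(succ[path[-1]])
        seen.update(path)
        if len(path) == 1:
            ss.append(i)                # both forced colors are single
        else:
            chains.append(path)         # ST, TT, ..., TT, ST
    # Everything left lies on a cycle.
    for i in range(len(matchings)):
        if i in seen:
            continue
        cycle, j = [i], succ[i]
        seen.add(i)
        while j != i:
            cycle.append(j)
            seen.add(j)
            j = succ[j]
        cycles.append(cycle)
    return chains, cycles, ss
```

The successor map is injective because the colors at $X_0$ are pairwise distinct, so no two paths can merge and every matching not reached from a path start must lie on a cycle.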

Step 3: making chains and cycles minimal. A sequence C of matchings (chain
or cycle) is minimal if it does not contain any two parallel color-forced edges.
A non-minimal sequence of matchings can be split into two shorter sequences
in the following way. Consider the sequence $C = (M_0, \ldots, M_{l-1})$ of matchings
and suppose that the edge colored $c_i$ of $M_i$ and the edge colored $c_j$ of $M_j$ are
parallel. We exchange the two edges, thus obtaining two new matchings $M_i'$ and
$M_j'$ with color-forced edges colored $c_j$ and $c_{i+1}$, and $c_i$ and $c_{j+1}$, respectively, and the two
new sequences of matchings $C_1 = (M_0, M_1, \ldots, M_{i-1}, M_j', M_{j+1}, \ldots, M_{l-1})$ and
$C_2 = (M_i', M_{i+1}, \ldots, M_{j-1})$. The sequence $C_1$ is of the same type (i.e., a cycle
or a chain) as C while $C_2$ is always a cycle. We repeat this process of splitting
one sequence into two new sequences until all sequences are minimal (i.e., they
do not contain parallel edges).

Step 4: constructing triplets of matchings. Next we partition all the matchings
into groups of three matchings that we call triplets. Each such triplet has six
color-forced edges; of these, two are colored with single colors and the remaining
four with double colors.

We obtain the triplets as follows. First, we consider all the chains of length 3
or greater. From each such chain $C = (M_0, M_1, \ldots, M_{l-1})$ we obtain one triplet
by stripping off C and grouping together the first two matchings $M_0, M_1$ and
the last matching $M_{l-1}$. Triplets obtained in this way will consist of two ST
matchings (that is, $M_0$ and $M_{l-1}$) and one TT matching (that is, $M_1$). The
color-forced edges are colored with single colors $s_0$ and $s_1$ and double colors $d_1, d_2, d_{l-1}$,
with $d_1$ being the common color of $M_0$ and $M_1$. Now we are left with cycles,
“stripped  chains,”  chains  of  length  2,  and  SS  matchings.  We  consider  the  even 
length  cycles  and  stripped  chains  first  and  construct  triplets  each  consisting  of 
two  consecutive  TT  matching  from  the  same  cycle  or  stripped  chain  and  one  SS 
matching.  We  repeat  the  same  process  for  odd  length  cycles  and  stripped  chains. 
However,  in  this  case  for  each  cycle  or  stripped  chain  there  will  be  exactly  one 
“leftover” TT matching. We then construct triplets with one SS matching along
with  a  pair  of  these  TT  leftover  matchings.  Finally,  if  at  any  time  during  the 
construction  of  the  triplets  we  run  out  of  SS  matchings,  we  continue  constructing 
triplets  by  grouping  together  each  individual  TT  matching  along  with  a  pair 
of  ST  matchings  that  constitute  a  chain  of  length  2.  If  the  total  number  of  old 
colors  is  exactly  4/3L,  thus  including  exactly  2/3L  single  colors  and  2/3L  double 
colors, all matchings can be grouped into such triplets. Instead, if we have less
than  4/3L  old  colors,  then  we  are  left  with  some  extra  TT  matchings  for  which 
there  is  no  corresponding  ST  or  SS  matching.  These  extra  TT  matchings  will 
be  dealt  with  separately  and  we  omit  from  this  abstract  further  details.  On  the 
other hand, if the number of old colors exceeds 4/3L, then we are left with extra
SS  or  ST  matchings  for  which  no  corresponding  TT  matching  exists.  Coloring 
these  matchings  is  trivial  since  we  can  use  no  new  color  (we  use  the  single  colors 
to  color  the  uncolored  edges)  and  thus  meet  the  two  conditions  presented  below. 

We  will  color  the  matchings  maintaining  the  following  two  conditions  which 
are  sufficient  to  prove  Theorem  1. 

Condition  1.  The  number  of  new  colors  used  is  at  most  D/2.  This  condition 
will  be  enforced  by  using  at  most  one  new  color  per  triplet. 

Condition 2. Each line sees at most $\max\{(1 + \alpha/4)L, \alpha L\}$ colors.

For values of $\alpha > 4/3$ this is enforced by making sure that if a line sees a new
color it does not see one of the old colors. Consequently, the number of colors
seen by a line does not exceed $\alpha L$ once all edges have been colored.

Lemma 5. Condition 1 above implies that the total number of colors used is at
most $(1 + \alpha/2)L$.

Proof. Since the number of edges adjacent to $W_0$ and $X_0$ is 2L, we have $2D + S = 2L$
and, since $\alpha L$ colors are used to color these edges, we have that $D + S = \alpha L$.
From these two equalities we get directly that $D = (2 - \alpha)L$. Therefore the total
number of colors used is at most $D + S + D/2 \le \alpha L + (2 - \alpha)L/2 = (1 + \alpha/2)L$.
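
The counting argument in the proof of Lemma 5 can be checked mechanically. The sketch below solves the two equalities for D and S and evaluates the resulting bound with exact rational arithmetic; the function names are hypothetical, and the expression for S is simply what the two equalities imply once D is known.

```python
from fractions import Fraction


def double_and_single(alpha, L):
    """Solve 2D + S = 2L and D + S = alpha*L for D and S."""
    alpha = Fraction(alpha)
    D = (2 - alpha) * L          # subtract the second equality from the first
    S = 2 * (alpha - 1) * L      # S = alpha*L - D
    assert 2 * D + S == 2 * L
    assert D + S == alpha * L
    return D, S


def total_color_bound(alpha, L):
    """Old colors (D + S) plus at most D/2 new colors (Condition 1)."""
    D, S = double_and_single(alpha, L)
    return D + S + D / 2


if __name__ == "__main__":
    a, L = Fraction(4, 3), 9
    print(double_and_single(a, L), total_color_bound(a, L))
```

For $\alpha = 4/3$ the two counts coincide, D = S = 2/3L, which agrees with the remark in Step 4 that exactly 4/3L old colors means 2/3L single and 2/3L double colors.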

Step 5: setting the active colors. We color each triplet individually using four
of the old colors that appear on the color-forced edges of the triplet and, in
some cases, a new color. The four old colors used are called the active colors for
the triplet and they include the two single colors of the triplet. The remaining
two active colors are chosen among the double colors of the triplet so that each
double color is active for exactly one triplet.

We  continue  by  determining  what  the  active  colors  are  going  to  be  for  each 
triplet.  We  have  to  be  careful  about  consistency  among  triplets  that  share  double 
colors;  i.e.,  include  TT  matchings  from  the  same  cycle  or  chain. 

First we fix the active colors of the triplets containing the leftover TT
matchings. In order to properly color such a triplet $(S, T_1, T_2)$ while maintaining the
properties above we choose the active colors to be the two single colors of
matching S along with the color of the color-forced edge touching $W_0$ in $T_1$ and the
color of the color-forced edge touching $X_0$ in $T_2$.

This choice of active colors for such a triplet forces the choice of active colors
for the triplets containing TT matchings coming from the same cycle or chain
as $T_1$ and $T_2$ in the following obvious way. Let $(S_1, T_3, T_4)$ be a triplet consisting
of one SS matching and two consecutive TT matchings from the same cycle or
stripped chain as $T_1$. Then the active colors of such a triplet are the colors of
the color-forced edges touching $W_0$ in $T_3$ and $T_4$ along with the two colors of
the color-forced edges of $S_1$. If, instead, $T_3$ and $T_4$ belong to the same cycle or
stripped chain as $T_2$, then the active colors are going to be the old single colors
appearing in $S_1$ along with the colors of the color-forced edges of $T_3$ and $T_4$ that
touch $X_0$.

Finally, we can determine the active colors of the triplets containing two TT
matchings belonging to even length cycles or chains (i.e., those cycles or chains
that did not give rise to leftover TT matchings) to be for each triplet the two
old single colors of the triplet along with the colors of the color-forced edges that
touch $X_0$ or $W_0$, picked arbitrarily as long as we are consistent across each cycle
or chain.

Step  6:  coloring  the  triplets.  As  we  mentioned  above  for  each  triplet  we  will 
use  the  active  colors  of  the  triplet  and,  sometimes,  a  new  color.  If  we  do  use  a 
new  color  for  a  triplet,  we  enforce  the  property  that  each  line  that  sees  the  new 
color  does  not  see  one  of  the  active  colors  of  the  triplet.  This  ensures  that  the 
total  number  of  colors  that  a  line  will  see  across  all  triplets  does  not  exceed  the 
number  of  old  colors  and  that  the  total  number  of  new  colors  introduced  for  all 
triplets  is  at  most  half  the  number  of  old  double  colors. 

There  are  four  general  types  of  triplets: 

Type  A  These  are  triplets  consisting  of  one  SS  matching  and  two  leftover  TT 
matchings.  A  special  case  of  type  A  triplet  occurs  when  one  or  both  the 
leftover  TT  matchings  is  actually  a  PP  matching  that  is  the  leftover  matching 
of  a  cycle  of  length  1. 
Type B These are triplets consisting of one SS matching and two consecutive
TT matchings from the same cycle or chain. A special case of type B triplet
occurs when the two TT matchings constitute a cycle of length 2.

Type C These are triplets consisting of the two ST matchings that constitute
a chain of length 2 and one TT matching. A special case of type C triplet
occurs when the TT matching is actually a PP matching.

Type  D  These  are  triplets  that  were  obtained  by  stripping  off  a  chain  the  first 
two  matchings  (an  ST  and  a  TT  matching)  and  the  last  matching  (an  ST 
matching).  A  special  case  of  type  D  triplet  occurs  when  the  chain  has  length 
exactly  3. 

Due  to  lack  of  space  we  next  show  the  coloring  algorithm  only  for  triplets  of 
type  A.  The  complete  coloring  appears  in  the  final  version. 

Step 6.A: coloring triplets of type A. Consider a triplet $R = (S, T_1, T_2)$ of type
A, where $S = (s_1, s_2)$, $T_1 = (x, y)$ and $T_2 = (w, z)$. We note that $s_1$ and $s_2$ are
single colors and that x, y, w, and z are double colors and let the active colors of
R be $s_1$, $s_2$, x, and z. Here we concentrate on the case in which the four double
colors are distinct separated colors. If x = y or w = z, then the corresponding
TT matching is actually a PP matching and the coloring is much simpler than
what we are going to describe below. If x = z or y = w, then R is actually a
triplet of type B.

Suppose  x,y,w,  and  z  are  distinct  separated  double  colors.  We  consider 
matchings  Ti  and  T2  together  as  one  cycle  cover  of  the  bipartite  graph.  In  what 
follows,  for  the  sake  of  clarity  we  assume  that  the  cycle  cover  of  two  matchings 
consists  of  one  single  cycle  that  spans  the  entire  bipartite  graph.  We  remark  that 
all  our  colorings  can  be  easily  adapted  if  such  a  cycle  cover  consists  of  more  than 
one  cycle. 

We first check if there exists an uncolored edge whose endpoints are incident
to color-forced edges colored with all four active colors. Note that these may
include the “fixed” color-forced edges colored with $s_1$, $s_2$, x, and z that belong to
R as well as the two “free” color-forced edges colored with x and z that belong
to other triplets. We denote by $e_x$ and $e_z$ the free color-forced edges colored with
x and z, respectively, and by $e_{s_1}$ and $e_{s_2}$ the color-forced edges colored with $s_1$
and $s_2$, respectively.

Suppose there is no edge restricted by all four active colors. We color the
uncolored edges of the cycle cover by starting from one of the color-forced edges
of the cycle colored with an active color (i.e., either x or z) and alternating
between x and z. When we encounter a vertex v that is incident to a free
color-forced edge e, we use color $s_2$ to color the edge, e', incident to v that would
have been colored with the same color as e. Then we color the next edge x and
continue  alternating  between  z  and  x.  This  is  possible  unless  e'  is  adjacent  to 
as  well.  Note  that  cannot  be  incident  to  the  same  vertex  as  Cg^,  and,  similarly, 
Cx  cannot  be  incident  to  the  same  vertex  as 

Now if e' is restricted by both z and $s_2$, then we color with $s_2$ the other edge
incident to v, color e' with x and continue alternating z and x (see Figure 1);
we finish by using $s_1$ to color the edges in the SS matching. This coloring is
obviously proper and we do not need to argue about the number of colors seen
by a line since we have used no new color.


Fig. 1. The case in which an edge is restricted by both z and $s_2$.


On  the  other  hand,  if  and  are  incident  to  the  same  vertex  v  then  we 
color  the  conflicting  adjacent  edge  e  =  with  si  and  continue  alternating  x 

and  z  starting  with  x.  The  uncolored  edges  of  the  SS  matching  are  then  colored 
using  S'l  except  for  the  edge  incident  to  u.  Edge  is  colored  S2 

unless  u*  sees  the  edge  colored  S2  used  to  fix  the  conflict  with  in  which  case 
Cu  is  colored  with  2:  (see  Figure  2). 


Fig. 2. The case in which an edge is restricted by both x and $s_2$.


The  previous  coloring  is  proper  unless  e  is  also  adjacent  to  in  which  case 
a  more  complex  coloring  is  performed. 

Finally we consider the situation where we have an edge (u, v) restricted by
all four active colors. Note that such a restricted edge belongs to one of the TT
matchings of the triplet, as edges of the SS matching cannot be restricted by $s_1$
or $s_2$. We color edge (u, v) with a new color n, the rest of the uncolored edges of the cycle
cover by alternating x and z, and the uncolored edges of the SS matching with
n. This coloring is obviously proper. No line except the lines containing vertices
u and v sees an edge colored with one of the two single colors. Moreover, since
u and v cannot be a line as they are adjacent, the line containing u does not see
color $s_2$ and the line containing v does not see color $s_1$. Therefore, if a line sees
n then it does not see at least one of the active colors.

2.1 An alternative coloring approach for $\alpha = 4/3$

In  this  section,  we  briefly  describe  an  alternative  method  for  coloring  edges  of  a 
bipartite graph G for the case $\alpha = 4/3$.

It is possible to show that the L perfect matchings obtained from G can be
grouped entirely into triplets such that each triplet can be colored with at most
one new color and with at most 4 colors per line, thus ensuring Conditions 1 and 2.
Due to a result in [10], every triplet $t = (M_1, M_2, M_3)$ with two double colors
and one single color incident to each of $W_0$ and $X_0$ can be colored in such a way,
provided that at least one double color d appears twice in t (call such a triplet
a KS-triplet) and that t can be partitioned into a gadget (a subgraph where $W_0$
and $X_0$ have degree 3 and all other vertices have degree 2) and a matching of
all vertices except $\{W_0, X_0\}$. Only the two single colors and the double color d
of t as well as one new color are used. Every KS-triplet can be partitioned into
gadget and matching unless it contains a PP-matching. Therefore, we assume
that G is decomposed into L perfect matchings such that the union of SS-, ST-,
and TT-matchings does not contain further PP-matchings.

A  PP-matching  and  a  chain  of  length  2  as  well  as  two  PP-matchings  and  an 
SS-matching  give  triplets  that  can  be  colored  without  any  new  color  and  4  colors 
per  line.  Chains  of  odd  length  and  cycles  of  even  length  yield  triplets  of  Type  B, 
C,  and  D,  which  are  KS-triplets.  Two  chains  of  even  length,  one  of  which  has 
length  >  2,  yield  KS-triplets  by  combining  the  first  (last)  two  matchings  of  the 
longer  chain  with  the  first  (last)  matching  of  the  shorter  chain  and  producing 
triplets  of  Type  B  or  C  from  the  rest.  If  there  is  a  chain  of  length  2,  a  cycle 
of  odd  length  also  yields  triplets  of  Type  B  and  C.  Note  that  there  is  always  a 
sufficient number of SS-matchings or chains of length 2 to produce KS-triplets,
because we have $(2/3)L$ edges with single colors and $(4/3)L$ edges with double colors
altogether. 

After  these  reductions,  we  are  left  with  at  most  one  chain  of  even  length  >  2, 
at  most  one  PP-matching,  a  number  of  cycles  of  odd  length,  and  SS-matchings. 
Two  cycles  of  odd  length  are  handled  by  choosing  an  SS-matching  and  two  TT- 
matchings,  one  from  each  cycle,  such  that  the  resulting  triplet  t  does  not  have 
parallel  color-forced  edges.  If  t  can  be  partitioned  into  gadget  and  matching, 
it  is  colored  with  reused  old  colors  and  one  new  color  using  techniques  similar 
to  [10],  and  triplets  of  Type  B  are  produced  from  the  remainder  of  the  two  cycles. 
Otherwise,  the  TT-matchings  can  be  reassembled,  turning  the  two  given  cycles 
into  a  single  cycle  of  even  length,  which  is  handled  as  above.  A  chain  of  even 
length  >  2  and  a  cycle  of  odd  length  are  combined  similarly. 

For a PP-matching and a cycle of odd length, we choose an arbitrary
SS-matching $M_1$ and a TT-matching $M_2$ from the cycle such that the cycle cover
$M_1 \cup M_2$ does not contain parallel color-forced edges. This cycle cover can be
colored with one new color and one of its single colors such that no line sees more
than 3 colors. The PP-matching $M_3$ is colored using its preserved double color,
thus ensuring that the coloring for $t = (M_1, M_2, M_3)$ meets the requirements.
The  remainder  of  the  cycle  is  combined  with  SS-matchings  into  Type  B  triplets. 
A  PP-matching  and  a  chain  of  even  length  >  2  are  handled  similarly. 

3 Reducing the routing problem to a constrained
bipartite  coloring  problem 

In  this  section  we  reduce  the  problem  of  assigning  wavelengths  to  the  constrained 
bipartite  edge  coloring  problem.  We  do  so  by  giving  an  algorithm  that  properly 
assigns  wavelengths  by  using  as  a  subroutine  our  algorithm  for  the  constrained 
bipartite  edge  coloring  problem  of  the  previous  section. 

Our algorithm for assigning wavelengths is a greedy algorithm like the ones
presented in [7, 6, 10]. The algorithm roots the tree at an arbitrary node and
computes a depth-first numbering of the nodes of the tree. The algorithm proceeds
in phases, one for each node v of the tree. The nodes are considered following
their depth-first numbering. The phase associated with node v assumes that a
partial proper coloring of all paths that touch (i.e., start, end, or go through)
nodes with numbers strictly smaller than v's has been computed and extends
the partial coloring to one that assigns proper colors to all paths that touch v
but have not been colored yet. We stress that the algorithm never recolors paths
that have been colored in previous phases.
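
The phase structure described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the tree is given as an adjacency dictionary, each path is given as the sequence of nodes it visits, and the names `dfs_order` and `phases` are hypothetical. The sketch only determines which paths are coloured in which phase; the colouring itself is left to the bipartite subroutine.

```python
def dfs_order(adj, root):
    """Return the nodes of the tree in depth-first (preorder) numbering."""
    order, seen, stack = [], {root}, [root]
    while stack:
        v = stack.pop()
        order.append(v)
        # Push neighbours in reverse so they are visited in listed order.
        for u in reversed(adj[v]):
            if u not in seen:
                seen.add(u)
                stack.append(u)
    return order


def phases(adj, paths, root):
    """For each node v, in depth-first order, list the indices of the paths
    that touch v and were not handled in an earlier phase.

    A path is a list of nodes; it touches v if v appears in it.
    Paths handled in earlier phases are never revisited.
    """
    done = set()
    schedule = []
    for v in dfs_order(adj, root):
        batch = [i for i, p in enumerate(paths) if i not in done and v in p]
        done.update(batch)
        schedule.append((v, batch))
    return schedule
```

Since every path touching a node with a smaller number has already been handled, each phase only has to extend the partial colouring, which matches the greedy (never recolouring) behaviour stated in the text.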

We now show the reduction of the path coloring problem of a phase associated
with node v to an instance of the constrained bipartite edge coloring of a graph
$G_v$. Without loss of generality, we assume full load L on each directed
link and denote by $c_0$ the parent of v and by $c_1, \ldots, c_k$ the children of v. We construct
$G_v$ in the following way. For each vertex $c_i$, $G_v$ has four vertices $W_i, X_i, Y_i, Z_i$,
and the left and right partitions are $\{W_i, Z_i \mid i = 0, \ldots, k\}$ and $\{X_i, Y_i \mid i = 0, \ldots, k\}$.
$G_v$ has an edge from $W_i$ to $X_j$ for each path of the tree directed out of $c_i$ into
$c_j$, and an edge from $W_i$ to $Y_i$ for each path from $c_i$ to v. Finally, for each path
from v to $c_j$, $G_v$ has an edge from $Z_j$ to $X_j$. See Figure 3. The above edges are
called real. Notice that no real edge extends across opposite vertices $Z_i$ and $Y_i$
or $W_i$ and $X_i$, and only edges with an endpoint in $W_0$ or $X_0$ already have a color,
as they correspond to requests touching v's parent and have been assigned a
color in a previous phase. Notice also that all vertices of type $W_i$ and $X_i$ have
degree L whereas vertices of type $Z_i$ and $Y_i$ do not necessarily have degree L.
We therefore add fictitious edges to the bipartite graph so that all vertices have
degree L. Clearly, any proper coloring of the edges of $G_v$ corresponds to a legal
assignment of wavelengths to requests that go through vertex v, and we compute
such a coloring of the edges of $G_v$ by running the algorithm of the previous
section on $G_v$.
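
As an illustration of the construction, the following Python sketch builds the real edges of the bipartite graph for one phase. It is a sketch under our own assumptions, not the authors' implementation: a request touching v is encoded as a pair `(src, dst)`, where each entry is the index i of a neighbour $c_i$ (0 for the parent) or `None` when the request starts or ends at v itself. The function name `real_edges` and this encoding are hypothetical, we read the last edge type as joining $Z_j$ to $X_j$, and the fictitious edges that pad every vertex to degree L are not added here.

```python
def real_edges(requests):
    """Return the real edges of the bipartite graph for one phase.

    Each request is (src, dst): a neighbour index, or None for v itself.
    Vertices are encoded as ("W", i), ("X", i), ("Y", i), ("Z", i).
    """
    edges = []
    for src, dst in requests:
        if src is None and dst is None:
            raise ValueError("a request must touch at least one neighbour of v")
        if src is not None and dst is not None:
            if src == dst:
                raise ValueError("a path cannot leave and enter the same neighbour")
            # Path going through v, out of c_src and into c_dst.
            edges.append((("W", src), ("X", dst)))
        elif dst is None:
            # Path from c_src ending at v.
            edges.append((("W", src), ("Y", src)))
        else:
            # Path starting at v and going into c_dst.
            edges.append((("Z", dst), ("X", dst)))
    return edges
```

Edges incident to `("W", 0)` or `("X", 0)` are exactly those that arrive already coloured from an earlier phase.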

4  Lower  bound 

In  this  section  we  present  our  lower  bounds  for  the  wavelength  routing  problem 
by  showing  that  any  greedy  algorithm  for  assigning  paths  to  requests  of  load  L 
on a tree cannot use fewer than $(5/3)L$ colors even if the tree is binary. The lower
bound  for  the  constrained  bipartite  edge  coloring  is  obtained  similarly. 

We prove the lower bound inductively. We assume inductively that, for a
vertex C, there are $\alpha_n L/2$ requests along each link to its parent and that all of
these requests are colored using different colors.

Fig. 3. Requests touching vertex v and the corresponding bipartite graph (only real
edges  are  shown). 


Then we assign requests between the two children A and B of C in such a
way that $(1 + \alpha_n/2)L$ colors are used in total and the inductive hypothesis between
one of A and B and one of its children is enforced for $\alpha_{n+1} = 1 + \alpha_n/4$. It is easy
to see that $\alpha^* = \lim_{n \to \infty} \alpha_n = 4/3$, where $\alpha_1 = 1$ and $\alpha_n = 1 + \alpha_{n-1}/4$ for n > 1.
Therefore, for any $\epsilon > 0$ and any greedy algorithm G, it is possible to construct
a set of communication requests of maximum load L so that G uses at least
$(5/3 - \epsilon)L$ colors.
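
The convergence claim can be checked numerically. The sketch below assumes the recurrence $\alpha_1 = 1$, $\alpha_n = 1 + \alpha_{n-1}/4$ (our reading of the text) and that the number of colors forced at step n is $(1 + \alpha_n/2)L$; the function names are hypothetical. Exact rational arithmetic is used so that no rounding is involved.

```python
from fractions import Fraction


def alpha(n):
    """n-th term of the recurrence a_1 = 1, a_n = 1 + a_{n-1}/4."""
    a = Fraction(1)
    for _ in range(n - 1):
        a = 1 + a / 4
    return a


def colours_per_unit_load(n):
    """Colours forced at step n, divided by L, i.e. 1 + alpha_n / 2."""
    return 1 + alpha(n) / 2
```

The gap to the fixed point shrinks by a factor of 4 at every step, since $4/3 - \alpha_n = (1/3)(1/4)^{n-1}$, so `colours_per_unit_load(n)` approaches 5/3 from below, which is why the bound is stated as $(5/3 - \epsilon)L$.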

The base of our induction for $\alpha_1 = 1$ is established in the following way.
We start with L requests in each direction between the root R and one of its
children R'. The greedy algorithm colors these requests using at least L colors
in each direction. We then choose two sets of L/2 requests along each direction
with each request colored with a different color, propagate them to one of the
children of R' and stop the remaining requests at R'.

Let C be a vertex and A and B the two children of its left child. We denote by
$K_2$ the set of colors used along the link (C, A), by $K_3$ the set of colors used along
the link (B, C), by $K_1$ the set of colors used along the link (B, A) and by $K_4$ the
set of colors used along the link (A, B). We inductively assume that $K_2 \cap K_3 = \emptyset$
and $|K_2| = |K_3| = \alpha L/2$, whence $|K_2 \cup K_3| = \alpha L$. We fill the link (B, A) to
capacity by assigning $k_1 = L(1 - \alpha/2)$ requests. These requests need to be colored
with new colors and thus the total number of colors used increases to $L(1 + \alpha/2)$.
Next we assign L requests to the link (A, B). The best that any greedy algorithm
can do is to color these L requests using all the new colors employed for
the link (B, A), plus half of the colors of $K_3$ and half of the colors of $K_2$. The
edge (A, C) thus sees $|K_1 \cup K_2 \cup K_4| = |K_1| + |K_2| + |K_4| - |K_2 \cap K_4| - |K_1 \cap K_4| =
(1 + \alpha/4)L$ colors. In order to complete the inductive step we have to enforce for A the
same situation as in C for $(1 + \alpha/4)$. This is achieved in the following way:

1. among the $|K_1| + |K_2| = L$ requests coming from C, we let only the following
continue to the left child of A:

-  Si:  (I  —  f )  T  requests  from  Ad- 


-  S2'.  requests  from  K2- 
for a total of $\frac{L}{2}(1 + \frac{\alpha}{4})$ colors.

2. and the $|K_4| = L$ requests coming up from A to C all originate from A
except for the following ones, which instead come from the right child of A:

-  Ri:  jL  requests  that  were  colored  with  colors  used  of  K2  and  which 
were  not  considered  in  S2  above; 

-  R2:  tL  requests  that  were  colored  with  colors  used  of  K3; 

-  R3:  (I  -  |a)  L  that  were  colored  with  colors  of  Ki  and  which  were  not 
considered  in  Si  above; 

for a total of $\frac{L}{2}(1 + \frac{\alpha}{4})$ colors.

Finally,  observe  that  the  requests  going  down  to  the  left  child  of  A  and  those 
coming  up  from  the  right  child  of  A  are  colored  with  different  colors  (i.e.  the 
sets  of  colors  are  disjoint).  This  completes  the  proof  of  Theorem  4. 

Abstract. Let T be a symmetric directed tree, i.e., an undirected tree
with each edge viewed as two opposite arcs. We prove that the minimum
number of colours needed to colour the set of all directed paths in T, so
that no two paths of the same colour use the same arc of T, is equal to
the maximum number of paths passing through an arc of T. This result
is applied to solve the all-to-all communication problem in
wavelength-division-multiplexing (WDM) routing in all-optical networks, that is,
we give an efficient algorithm to optimally assign wavelengths to all
the paths of a tree network. It is known that the problem of colouring a
general subset of all possible paths in a symmetric directed tree is
NP-hard. We study conditions for a given set S of paths to be coloured
efficiently with the minimum possible number of colours/wavelengths.


1  Introduction 

Let T be a tree and x, y two vertices of T. The dipath P(x, y) in T is the
undirected path joining x to y, in which each edge is considered traversed in
the direction from x to y. In other words, the dipaths P(x, y) and P(y, x) are
different and do not traverse any edge in the same direction. We are interested
in colouring the set of dipaths P(x, y), for all ordered pairs x, y of vertices of T,
in such a way that two dipaths using the same edge of T in the same direction
obtain different colours. Let c(T) denote the minimum number of colours in such
a colouring of the dipaths of T. Let $\pi(T)$ denote the maximum number of dipaths
P(x, y) which all pass through the same edge of T in the same direction. Clearly
$\pi(T) \le c(T)$ for every tree T. It has been conjectured by Bermond et al. [7] that
in fact $\pi(T) = c(T)$ holds for every T. Here we prove this conjecture.

Moreover, given a subset S of all the paths on a tree T, we consider conditions
for  the  existence  of  an  efficient  algorithm  to  colour  all  the  paths  in  S  with  the 
minimum  possible  number  of  colours;  this  problem  is  NP-hard  in  general. 

* Work partially supported by the Italian Ministry of the University and of the
Scientific Research in the framework of the project: “Efficienza di Algoritmi e Progetto
di Strutture Informative” and by Galileo Project.

1.1 Motivations and Related Work

The problem originally arose in the context of all-optical networks. Optical
networks are emerging as key technology in communication networks and are
expected to dominate many applications, such as video conferencing, scientific
visualisation, real-time medical imaging, high-speed super-computing and
distributed computing [17, 25, 29]. The books of Green [17] and McAulay [22]
offer a comprehensive overview of the physical theory and applications of this
emerging technology. All-optical networks exploit photonic technology for the
implementation of both switching and transmission functions [16], and
maintain the signal in optical form through the transmission, thus allowing for much
higher data transmission rates (since there is no prohibitive overhead due to
conversions to and from the electronic form). Wavelength-division multiplexing
(WDM) [10] partitions the optical bandwidth into a number of channels, and
allows multiple data streams to be transferred concurrently along the same optical
fiber, on different channels, i.e., different wavelengths. The same wavelength on
two input ports of a switch cannot be routed to the same output port, due to
electromagnetic interference. There are various switches considered in the literature,
with ‘generalized switches’ being one of the more common variants [1, 2, 27].
These switches allow different signals to travel on the same communication link
into the switch (on different wavelengths), and then exit from it along different
links.

All-optical  networks  are  networks  where  the  information,  once  transmitted  as 
light,  reaches  its  final  destination  directly  without  being  converted  to  electronic 
form in between. Maintaining the signal in optical form makes high speeds
achievable in these networks, since there is no overhead due to conversions to and from the
electronic form. Such an approach thus eliminates the “electronic
bottleneck” of communication networks with electronic switching.

In an all-optical network one needs to set up a number of communications
(paths) between given pairs of nodes, with each path being transmitted on one
particular wavelength, and all paths sharing a link having different wavelengths.
Specifically, one is given a set of requests $(a_1, b_1), (a_2, b_2), \ldots, (a_k, b_k)$, and is
required to connect each $a_i$ to the corresponding $b_i$ by a path $P_i$ and assign
a wavelength to each path $P_i$ so that paths of the same wavelength do not share
a link. Viewed in this light, the problem has initially been treated in the context
of undirected graphs [2, 1, 27]. However, it has recently become clear that each
bidirectional optical link will actually consist of a pair of unidirectional links [25],
and hence the new models of the situation tend to represent the network by a
symmetric directed graph, or equivalently, view each path as a dipath (as above)
[7, 23, 18]. We study the situation in the case of trees. The interest in trees is due
to the fact that tree-like networks are standard in the telecommunications industry
[23]. Furthermore, trees free us from one half of the problem - that of choosing
the actual paths for connecting the required nodes (since in a tree these paths
are unique). The minimum number of wavelengths corresponds to the minimum
number of colours in a colouring of dipaths as detailed above. This parameter
is considered of importance in evaluating the competitiveness of the wavelength
division multiplexing technology [23].

Thus  this  general  problem  becomes  one  of  colouring  a  given  set  (or  multiset) 
S  of  dipaths  in  a  tree  T  with  the  minimum  number  of  colours,  so  that  dipaths 
using  one  edge  in  the  same  direction  obtain  different  colours.  We  find  the  above 
terminology  convenient  to  work  with.  However,  it  should  be  clear  to  the  reader 
that  an  equivalent  formulation  would  consider  T  to  be  a  symmetric  directed 
tree  -  by  replacing  each  edge  of  T  with  the  two  opposite  arcs  (optical  links) 
corresponding  to  it  -  and  then  each  dipath  would  simply  become  a  directed  path 
in  the  usual  sense  of  the  word.  Conditions  on  using  an  edge  in  one  direction  then 
simply  translate  into  conditions  on  using  an  arc. 

We call a colouring of S proper if dipaths of S using
one edge in the same direction obtain different colours. The minimum number
of colours in a proper colouring of a set (multiset) S of dipaths in a tree T
will be denoted by $c_S(T)$, and the maximum number of dipaths from S that
pass through one edge of T in one direction by $\pi_S(T)$. We clearly must have

$\pi_S(T) \le c_S(T)$.
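
To make the two quantities concrete, here is a small Python sketch that computes the load of a set of dipaths and checks whether a given colouring is proper. It is an illustration only: each dipath is assumed to be given as the list of vertices it visits in order, and the names `arcs`, `load` and `is_proper` are hypothetical.

```python
from collections import Counter


def arcs(path):
    """Directed edges (arcs) traversed by a dipath given as a vertex list."""
    return list(zip(path, path[1:]))


def load(paths):
    """Maximum number of dipaths using one edge in the same direction."""
    count = Counter(a for p in paths for a in arcs(p))
    return max(count.values(), default=0)


def is_proper(paths, colours):
    """True if no two dipaths of the same colour share an arc."""
    used = set()
    for p, c in zip(paths, colours):
        for a in arcs(p):
            if (a, c) in used:
                return False
            used.add((a, c))
    return True
```

Any proper colouring needs at least `load(paths)` colours, because all dipaths through a most heavily used arc must receive pairwise different colours; this is the inequality stated above.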

The  problem  of  colouring  a  general  subset  of  all  possible  paths  in  a  symmetric 
directed  tree  is  an  NP-hard  problem  [12].  Approximation  algorithms  are  given 
in  [27,  23,  18,  19].  The  best  ratio  is  obtained  in  [18]  where  the  authors  provide 
an  algorithm  that  requires  at  most  ^/S7Ts(T)  colours,  for  any  set  S  of  paths  in 
a symmetric directed tree T. A recent survey including this topic is given in [6].


1.2  Our  Results 

In Section 2 we concentrate on the problem of all-to-all communication (or
‘gossiping’). In this situation, every node is requesting a connection with every other
node. All-to-all communication among the processors is one of the most
important issues in multi-processor systems. The need for this kind of communication
arises in many problems of parallel and distributed computing including many
scientific computations [8, 11, 13] and database management [30]. Due to the
considerable practical relevance in parallel and distributed computation and the
related interesting theoretical issues, such problems have been extensively
studied in the literature (see the surveys [20, 21, 24, 6]). First studies of this problem
in the context of optical networks can be found in [7, 5, 6].

In  this  paper,  we  show  that  the  minimum  number  of  colours  necessary  to 
establish  all-to-all  connections  in  a  tree  is  equal  to  the  maximum  number  of 
intersecting  dipaths,  i.e.,  we  shall  prove  the  following  result. 

Theorem 1. Let T be a tree. Then $c(T) = \pi(T)$.

Theorem 1 settles a conjecture by Bermond et al. [7].

We stress that our proof also yields an efficient (i.e., polynomial) algorithm
for the actual assignment of the colours to the paths.

In Section 3 we study conditions, given a set S of paths on a tree T, for the
existence of an efficient algorithm to colour the paths in S with the minimum
possible number of colours. We recall that the problem of optimal colouring of
paths is NP-hard [12]. We show that $c_S(T) = \pi_S(T)$ for each set of paths S
if and only if T is a generalized star, that is, a tree obtained from a star by
replacing each edge with a path.

Moreover, given any tree T, we give conditions on the set S assuring that
$c_S(T) = \pi_S(T)$ and $c_S(T)$ can be found in polynomial time.

Due  to  space  limitations  some  proofs  are  omitted  from  this  extended  abstract. 


2  Colouring  all  paths 

In this section we consider the problem of all-to-all communication (or
‘gossiping’). In this situation, every node is requesting a connection with every other
node; thus S consists of the paths P(x, y) for all ordered pairs x, y of vertices of T,
and we shall omit the subscripts S and write $c(T)$, $\pi(T)$ instead of $c_S(T)$, $\pi_S(T)$.

We will find it more convenient to prove a weighted version of the theorem.
A weighted tree is a tree T with positive integer weights w(x) on the vertices
x of T. (The intention of the weights is to have a vertex of weight w represent
w unweighted vertices.) The total weight of a set X of vertices of T is
$w(X) = \sum_{x \in X} w(x)$. (In particular w(T) is the weight of the entire tree.)

Let e be an edge of a weighted tree T. The removal of e from T results in
two weighted subtrees $T_1$ and $T_2$. The load of e is the product $w(T_1)w(T_2)$. The
forwarding index of the weighted tree T, denoted by $\pi(T)$, is the maximum load
of any edge in T. It is clear that when all weights are 1 this definition coincides
with the previous definition of $\pi(T)$.
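
The forwarding index of a weighted tree is easy to compute from subtree weights. The following Python sketch is an illustration under our own assumptions: the tree is an adjacency dictionary, the weights are a dictionary from vertices to positive integers, and the name `forwarding_index` is hypothetical.

```python
def forwarding_index(adj, w):
    """Maximum over all edges e of w(T1) * w(T2), where T1 and T2 are the
    two subtrees obtained by removing e from the weighted tree."""
    root = next(iter(adj))
    parent = {root: None}
    order = [root]
    # Breadth-first traversal; the list grows while it is being scanned.
    for v in order:
        for u in adj[v]:
            if u not in parent:
                parent[u] = v
                order.append(u)
    # Accumulate subtree weights from the leaves towards the root.
    sub = dict(w)
    for v in reversed(order):
        if parent[v] is not None:
            sub[parent[v]] += sub[v]
    total = sub[root]
    # The edge (v, parent[v]) separates weight sub[v] from total - sub[v].
    return max(
        (sub[v] * (total - sub[v]) for v in order if parent[v] is not None),
        default=0,
    )
```

For a path on four vertices of weight 1 the middle edge has load 2 * 2 = 4, which is also the number of dipaths crossing it in one direction in the unweighted setting.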

In a weighted tree T, we shall consider the multiset of all dipaths which
consists of w(a)w(b) copies of the dipath from a to b, for every ordered pair of
vertices a, b. We denote by c(T) the minimum number of colours in a proper
colouring of the multiset of all dipaths. When all weights are 1, the multiset
of all dipaths is precisely the set of all dipaths, and so the definition of c(T)
also coincides with the one given earlier. For a particular vertex v, we let In(v)
(respectively Out(v)) consist of those dipaths from the multiset of all dipaths
which end (respectively begin) with v. The weighted version of our theorem is
as follows:

Theorem 2. Any weighted tree T satisfies

$c(T) = \pi(T)$

and there exists an efficient algorithm which colours T with c(T) colours.


2.1  Two  operations  to  generate  weighted  trees 

There is a natural way to build all trees from a single edge, by adding and
splitting leaves. We will formally define these operations in the context of weighted
trees, and then apply them to give an inductive proof of our theorem.

In the following definition we assume that T is a weighted tree of weight
w(T) = W, x is a leaf of T, f is the parent of x, and finally, that $\delta$ is a positive
integer, $\delta < w(x)$.

Definitions. The operation $AddLeaf_\delta(x, T)$ modifies T as follows:

- the weight of x is decreased by $\delta$

- a new node y is added with weight $\delta$

- the edge [y, x] is added.

The operation $SplitLeaf_\delta(x, T)$ modifies T as follows:

- the weight of x is decreased by $\delta$

- a new node y is added with weight $\delta$

- the edge [y, f] is added. (Recall that f is the parent of x.)
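
The two operations can be sketched in Python as follows. This is a minimal illustration under our own assumptions: the tree is stored as an adjacency dictionary `adj` and a weight dictionary `w`, both modified in place; the function names and the explicit name `y` for the new node are hypothetical; and the legality condition discussed next is deliberately not checked.

```python
def _move_weight(adj, w, x, y, delta):
    """Shift weight delta from leaf x to a fresh node y (shared checks)."""
    if y in adj:
        raise ValueError("y must be a new node")
    if len(adj[x]) != 1:
        raise ValueError("x must be a leaf")
    if not 0 < delta < w[x]:
        raise ValueError("delta must satisfy 0 < delta < w(x)")
    w[x] -= delta
    w[y] = delta


def add_leaf(adj, w, x, y, delta):
    """AddLeaf: the new node y of weight delta is attached to x itself."""
    _move_weight(adj, w, x, y, delta)
    adj[y] = [x]
    adj[x].append(y)


def split_leaf(adj, w, x, y, delta):
    """SplitLeaf: the new node y of weight delta is attached to the parent of x."""
    (f,) = adj[x]  # x is a leaf, so its only neighbour is its parent f
    _move_weight(adj, w, x, y, delta)
    adj[y] = [f]
    adj[f].append(y)
```

Both operations preserve the total weight of the tree, since the weight removed from x is exactly the weight given to y.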

We  say  that  an  operation  AddLeafs(x,T)  or  SplitLeafsxT  is  legal  if  i +  «)(x)  < 
W  and  iu{x)  <W/2 

We will often abbreviate the notation to simply say that we have performed
an operation AddLeaf or SplitLeaf (with respect to the node x and the
weight $\delta$ if needed). It is easy to see that if an operation $SplitLeaf_\delta(x, T)$
(resp. $AddLeaf_\delta(x, T)$) is legal then in the new tree the load of [x, f] and [y, f]
(resp. [x, y]) cannot be larger than the load of [x, f] in T. Therefore we have the
following property.

Property 2.1 If an operation AddLeaf or SplitLeaf is legal then the
forwarding index of the new tree does not exceed the forwarding index of T.

Definition 4. Let T be a weighted tree, and let W denote w(T). T is called a
W/C-tree if the two trees resulting from the removal of an edge of maximum
load have weights C and W - C, with $C \ge W/2$.

Notice that the above definition is unambiguous since each edge of maximum
load is associated with the same value of C, and $\pi(T) = C(W - C)$. In case
T is a weighted star, the above definition is equivalent to the fact that the
maximum weight of a leaf is W - C; we call T a W/C-star.

Given a W/C-tree T, we will recursively construct T from some initial
W/C-star S by means of a sequence of legal AddLeaf and SplitLeaf operations. By
Property 2.1 this will assure that at each step of the construction we have a tree
with forwarding index $\pi(S) = C(W - C) = \pi(T)$.

Lemma 5. T can be generated from some W/C-star T* by repeated application
of legal operations AddLeaf or SplitLeaf.

Proof (Sketch). We first show that any W/C-tree T contains a vertex u such
that the maximum weight of a component of $T \setminus \{u\}$ is W - C.

In order to construct our tree T, we start from the W/C-star T* consisting
of the vertex u and all its neighbours in T; for each neighbour v of u we set the
weight w(v) of v equal to the weight of the component of $T \setminus \{u\}$ that contains
v.

Let t be the number of nodes of T which are not adjacent to u. If t = 0
then the tree T is a W/C-star and we don't need to perform any operations.
Otherwise we suppose, by induction, that the result holds for all smaller values of t. Let z
be a leaf of T of maximum distance from u, and let p be the parent of z. This
implies that p has at most one neighbour which is not a leaf.

- If the degree of p is strictly greater than two, then let x be a leaf neighbour
of p other than z. Let T' be the weighted tree obtained from T by removing
z and increasing the weight of x by w(z). Then T is generated from T' by
the operation $SplitLeaf_{w(z)}(x, T')$.

- If the degree of p is two, then let x = p. Let T' be the weighted tree obtained
from T by removing z and increasing the weight of x by w(z). Then T is
generated from T' by the operation $AddLeaf_{w(z)}(x, T')$.

We  then  show  that  in  both  cases  the  operations  are  legal.  □ 


2.2  An  inductive  colouring 

We have seen how an arbitrary W/C-tree T can be constructed from a W/C-star
by legal operations, with all intermediate trees being also W/C-trees. We now
begin to prove that the multiset of dipaths in each of these trees admits a proper
colouring with $C(W - C)$ colours.

Lemma 6. The multiset of dipaths of any W/C-star T can be efficiently coloured
with $C(W - C)$ colours.

Proof. The crucial observation here is the following: In a star, two dipaths
conflict (use some edge in the same direction and hence must obtain different
colours) if and only if they have the same beginning or the same end. In
other words, two dipaths of the same colour must belong to two different
multisets In(v) and to two different multisets Out(v). For each vertex v we have
$|In(v)| = |Out(v)|$, but these sizes differ from vertex to vertex. Of course, the
maximum $|In(v)| = \pi(T)$. We now add to each In(v) and Out(v), $\pi(T) - |In(v)|$
artificial paths (consisting of the single vertex v), to arrive at a situation where
each In(v) and Out(v) has exactly $\pi(T)$ dipaths. Thus the union of any k sets
In(v) contains at most k sets Out(v), and, according to the theorem of Hall
[28] (Theorem 9.2.1), one can efficiently determine a set of dipaths consisting
of exactly one representative from each In(v) and from each Out(v). These
dipaths will be coloured by colour 1, and deleted from consideration. Now each
In(v) and each Out(v) has $\pi(T) - 1$ dipaths, and so we can continue as above.
Clearly, this will produce a proper colouring of the multiset of dipaths in T with
$\pi(T) = C(W - C)$ colours. □

We continue, assuming that we have a W/C-tree T with a proper colouring
of its multiset of dipaths with $C(W - C)$ colours, and show how to induce a
proper colouring of a tree T' obtained by a legal operation.

Let x be a fixed leaf in T (the leaf on which we shall perform the legal
operation AddLeaf or SplitLeaf). We wish to use again the Theorem of Hall
in  a  fashion  similar  to  the  above  proof,  but  treating  only  the  multiset  of  dipaths 
starting  (and  ending)  with  x.  These  dipaths  are  already  coloured,  and  we  may 
have  used  different  colours  for  Out{x)  and  In{x).  We  deal  with  this  complication 
by  introducing  the  following  bijection: 

Let Out denote the set of colours used on the dipaths from the multiset
Out(x), and let In denote the set of colours used on the dipaths from the multiset
In(x). Since all dipaths in Out(x), and in In(x), have different colours, $|Out| = |In|$.
Let $\phi$ be a fixed bijection between Out and In, such that for any
$c \in Out \cap In$, $\phi(c) = c$.

Notice that the w(x)w(z) dipaths between x and any vertex z of T \ {x}
must all obtain different colours, as they all use the unique edge out of x. We
now arbitrarily fix (for each vertex z) a partition of these w(x)w(z) colours
into w(z) classes of size w(x), denoted by O_1, O_2, .... Similarly, we fix
(for each z) another partition of the set of w(x)w(z) colours of dipaths from z
to x into classes, each of size w(x). We shall say that two colours on
dipaths starting in x are I-equivalent if they belong to the same class I_j for
some z ∈ T \ {x}, j ∈ {1, 2, ..., w(z)}. Similarly, we shall say that two colours
on dipaths ending in x are O-equivalent if they belong to the same class O_j for
some z ∈ T \ {x}, j ∈ {1, 2, ..., w(z)}.

Definition 7. A supercolour is a set U of colours such that no colours from U
are I-equivalent, and no colours from φ(U) are O-equivalent.

Let X be the set of w(x)(W - w(x)) colours used by the dipaths starting in x.

Lemma 8. The set X of colours can be partitioned into w(x) supercolours.
Proof. Omitted. □

The following result allows us to complete the proof of Theorem 2.


Proposition 9. If T is a W/C-tree with a proper colouring of its multiset of
all dipaths, and if T' is obtained from T by performing the legal operation
AddLeaf_s(x, T) or SplitLeaf_s(x, T), then T' is a W/C-tree which also admits
a proper colouring of its multiset of all dipaths. Such a colouring of T' can be
efficiently determined.

Proof. Omitted. □
3 General sets of paths

We have shown that c(T) = π(T) for any tree T, even in the general case of
weighted trees. Thus the set (multiset) of all dipaths P(x, y) can be coloured
with π(T) colours. Our proof yields a polynomial algorithm for the actual
assignment of the colours. In the more general situation of an arbitrary set S of
dipaths, it is known that the problem of optimally colouring the paths in the set
S is NP-hard [12], and only approximation algorithms are known [23, 18, 19].

The  undirected  version  of  the  problem,  that  is,  minimize  the  number  of 
colours  in  a  colouring  of  paths  of  a  tree  T  so  that  all  paths  using  an  edge  of 
T  have  different  colours,  is  also  NP-hard  [15,  27]. 

In this section we make some additional remarks about c_S(T), that is, the
minimum number of colours in a colouring of the paths from a subset S of all the
paths on a directed symmetric tree T, such that conflicting paths obtain different
colours. We consider situations in which c_S(T) can be efficiently evaluated.

It is easy to see that if T is a path or a star then π_S(T) = c_S(T) for every S.
In fact, when T is a path, π_S(T) = c_S(T) is equivalent to the fact that for an
interval graph the chromatic number is equal to the maximum clique size [14], and
when T is a star, π_S(T) = c_S(T) is equivalent to the fact that for a bipartite
graph the chromatic index is equal to the maximum degree [9]. These results also
imply corresponding polynomial algorithms [9, 14]. We now extend these results
(and algorithms) as follows.
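For the path case, the interval-graph fact can be illustrated by the standard greedy colouring by left endpoint. The sketch below is not taken from the paper: it assumes all dipaths run in the same direction along the path (the two directions never conflict and can be coloured separately) and represents a dipath from vertex `left` to vertex `right` as the pair `(left, right)`. The function name `colour_intervals` is hypothetical.

```python
import heapq


def colour_intervals(intervals):
    """Greedy colouring of dipaths on a path, all in the same direction.

    intervals: list of (left, right) with left < right; the dipath uses
               the edges between consecutive vertices left..right.
    Returns a colour per interval; dipaths sharing an edge get different
    colours and the number of colours equals the maximum load.
    """
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    colour = [None] * len(intervals)
    active = []       # heap of (right endpoint, colour) still in use
    free = []         # heap of colours released by finished intervals
    next_colour = 0
    for i in order:
        left, right = intervals[i]
        # Intervals ending at or before `left` share no edge with this one.
        while active and active[0][0] <= left:
            heapq.heappush(free, heapq.heappop(active)[1])
        if free:
            c = heapq.heappop(free)
        else:
            c = next_colour
            next_colour += 1
        colour[i] = c
        heapq.heappush(active, (right, c))
    return colour
```

A new colour is opened only when all existing colours are busy on the edge leaving `left`, so the colour count never exceeds the load of that edge.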

Definition 10. The conflict graph of a set of paths S on a tree T is the undirected
graph whose vertices are the dipaths from S, and two dipaths are adjacent if
and only if they conflict, i.e., use an edge of T in the same direction.
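Definition 10 translates directly into code. The following sketch is an illustration only: it assumes the tree is given as an adjacency dictionary and each dipath as a pair `(s, t)` of end vertices; the helper names `tree_dipath_arcs`, `conflict_graph` and `load` are hypothetical. It also computes the load, i.e. the largest number of dipaths using one edge in one direction.

```python
from itertools import combinations


def tree_dipath_arcs(adj, s, t):
    """Set of arcs (u, v) on the unique dipath from s to t in the tree adj."""
    parent = {s: None}
    stack = [s]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                stack.append(v)
    arcs = set()
    v = t
    while parent[v] is not None:
        arcs.add((parent[v], v))
        v = parent[v]
    return arcs


def conflict_graph(adj, dipaths):
    """Pairs (i, j), i < j, of dipaths sharing an edge in the same direction."""
    arcs = [tree_dipath_arcs(adj, s, t) for s, t in dipaths]
    return {(i, j) for i, j in combinations(range(len(dipaths)), 2)
            if arcs[i] & arcs[j]}


def load(adj, dipaths):
    """Maximum number of dipaths using one edge in one direction."""
    count = {}
    for s, t in dipaths:
        for arc in tree_dipath_arcs(adj, s, t):
            count[arc] = count.get(arc, 0) + 1
    return max(count.values(), default=0)
```

For example, on the star with centre 0 and leaves 1, 2, 3, the dipaths (1, 2) and (1, 3) conflict on the arc (1, 0), while (1, 2) and (2, 1) do not conflict.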

Definition  11.  A  generalized  star  is  a  tree  obtained  from  a  star  by  replacing 
each  edge  with  a  path  (the  paths  may  have  different  lengths). 

Notice  that  a  generalized  star  is  a  tree  in  which  at  most  one  vertex  has 
degree  greater  than  two,  and,  conversely,  any  tree  in  which  at  most  one  vertex 
has  degree  greater  than  two  is  a  generalized  star.  Also  note  that  stars  and  paths 
are generalized stars. We proceed to prove that all conflict graphs in a generalized
star T are perfect; this will imply in particular that π_S(T) = c_S(T) for all S.
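The degree characterisation gives an immediate test. A minimal sketch, assuming the input `adj` is an adjacency dictionary that already describes a tree (this is not checked), with the hypothetical name `is_generalized_star`:

```python
def is_generalized_star(adj):
    """True if at most one vertex of the tree adj has degree greater than two."""
    return sum(1 for v in adj if len(adj[v]) > 2) <= 1
```

Paths and stars pass the test, whereas a tree with two vertices of degree three does not.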

Definition 12. An odd hole of an undirected graph is an induced cycle without
chords, of odd length greater than three. An odd antihole is an induced
complement of a cycle without chords, of odd length greater than three.

Lemma 13. The conflict graph of any set of paths on a generalized star cannot
contain an odd hole or an odd antihole.

Proof  Omitted.  □ 

Since  it  is  not  hard  to  show  that  the  conflict  graph  of  trees  satisfies  the 
perfect  graph  conjecture,  the  above  Lemma  13  implies  that  conflict  graphs  in 
generalized  stars  are  perfect;  therefore  we  have  the  following  result. 
Corollary 14. For any set S of dipaths in a generalized star T we have c_S(T) =
π_S(T).


We remark that by combining polynomial algorithms for edge colouring
bipartite multigraphs and for vertex colouring interval graphs, we obtain a
polynomial algorithm for colouring the dipaths of S in a generalized star with π_S(T)
colours. It is not difficult to observe that whenever T is a tree other than a
generalized star, then there exists a set of dipaths S in T such that π_S(T) ≠ c_S(T).

Proposition 15. c_S(T) = π_S(T) for all sets S if and only if T is a generalized
star.


We also consider the following condition on S which assures that π_S(T) =
c_S(T):

Definition 16. A set of paths S is well distributed in T if T does not contain
an odd number of edges [v, a_1], [v, a_2], ..., [v, a_{2k+1}] such that some dipath of
S contains both edges [v, a_i] and [v, a_{i+1}] (in some direction), for any index
i = 1, ..., 2k + 1 (addition on indices is taken modulo 2k + 1).

Proposition 17. If S is well distributed in T then

c_S(T) = π_S(T)

and c_S(T) can be found in polynomial time.

Proof  (Sketch).  We  verify  that  if  S  is  well  distributed  in  T  then  T  admits 
an  orientation  such  that  each  dipath  in  S  either  uses  all  edges  in  the  chosen 
direction,  or  all  in  the  opposite  direction.  Since  a  path  which  uses  edges  in  the 
chosen  direction  cannot  conflict  with  a  path  which  uses  edges  in  the  opposite 
direction,  we  can  colour  each  set  separately.  It  is  easy  to  see  that  the  conflict 
graph of each of these sets is chordal and hence c = π and π can be found in
polynomial time [14].
Abstract. The paper deals with on-line routing in WDM (wavelength
division  multiplexing)  optical  networks.  A  sequence  of  requests  arrives 
over  time,  each  is  a  pair  of  nodes  to  be  connected  by  a  path.  The  problem 
is  to  assign  a  wavelength  and  a  path  to  each  pair,  so  that  no  two  paths 
sharing  a  link  are  assigned  the  same  wavelength.  The  goal  is  to  minimize 
the  number  of  wavelengths  used  to  establish  all  connections. 

We consider trees, trees of rings, and mesh topologies. We give on-line
algorithms with competitive ratio O(log n) for all these topologies. We
give a matching Ω(log n) lower bound for meshes. We also prove that any
algorithm  for  trees  cannot  have  competitive  ratio  better  than 

We also consider the problem where every edge is associated with parallel
links. While in WDM technology a fiber link requires different wavelengths
for every transmission, SDM (space division multiplexing) technology
allows parallel links for a single wavelength, at an additional cost.
Thus, it may be beneficial in terms of network economics to combine
the two technologies (this is indeed done in practice). For arbitrary
networks with Ω(log n) parallel links we give an on-line algorithm with
competitive ratio O(log n).


1  Introduction 

All-optical networks promise data transmission rates several orders of magnitude
higher than current networks. The high speeds in these networks arise from
maintaining signals in optical form throughout a transmission, thereby avoiding
the overhead of conversions to and from electrical form (see [Gr92] for an
overview of the topic). Wavelength division multiplexing (WDM) supports the
propagation of multiple laser beams of distinct wavelengths through an optic
fiber. Thus, the high bandwidth of the WDM network is utilized by partitioning
it into many “channels”, each at a different optical wavelength. Intuitively, we may
think of wavelengths as light rays of different colors.

A major algorithmic problem for optical networks is that of routing. Each
routing request consists of a pair of nodes in the network, and requires the
assignment of a path and a wavelength (color). The key restriction is that two
requests with equal wavelength cannot be routed through the same link. The
main goal is to lower the number of wavelengths needed for a given set of
routing requests.

Many  of  the  applications  for  high  speed  optical  networks  are  real-time.  It  is 
therefore  very  natural  to  consider  the  problem  of  routing  in  an  on-line  setting 
where  routing  requests  appear  over  time. 

The Path Coloring Problem. The routing problem on a WDM network with
generalized switches is referred to as path coloring. More formally, let G = (V, E)
be a graph representing the network, with |V| = n. We are given a sequence of
routing requests consisting of pairs p_i = (s_i, t_i) of nodes in G. The algorithm
must assign a path connecting s_i and t_i and a color, so that no two paths sharing
an edge are assigned the same color. The goal is to minimize the number of
colors. The performance measure for an on-line algorithm is the competitive
ratio [ST85], defined as the worst case ratio over all request sequences between
the number of colors used by the on-line algorithm and the optimal number of
colors necessary on the same sequence.

While in WDM technology a fiber link requires different wavelengths for every
transmission, SDM (space division multiplexing) technology allows parallel
links for a single wavelength, at an additional cost. This can be profitable since
only a limited number of wavelengths are available in practice. The two technologies
are then combined to find an efficient trade-off between the two approaches.
This motivates considering a generalization of the path coloring problem where
a link of a color is replaced with a number of parallel links. We will alternatively
model this case with a bandwidth B available on a link for any color, meaning
that B paths of the same color may be routed through a link (that is, in the
basic path coloring problem B = 1).
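The feasibility constraint with bandwidth B can be stated as a short check. This is a sketch under our own conventions, not part of the paper: paths are lists of vertices, links are treated as undirected, and the name `is_feasible` is hypothetical.

```python
from collections import Counter


def is_feasible(paths, colors, B=1):
    """Check that no link carries more than B paths of the same color.

    paths:  list of vertex sequences, one per routed request.
    colors: list of colors, colors[i] being the color of paths[i].
    B:      bandwidth per link and color (B = 1 is basic path coloring).
    """
    use = Counter()
    for path, c in zip(paths, colors):
        for u, v in zip(path, path[1:]):
            # frozenset makes the link {u, v} independent of direction
            use[(frozenset((u, v)), c)] += 1
    return all(n <= B for n in use.values())
```

With B = 1 two paths sharing a link need different colors; raising B to 2 lets them share one.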

Related  previous  work. 

The off-line path coloring problem has been studied by Raghavan and Upfal
[RU94], who give constant approximation algorithms for undirected trees
and trees of rings. Further results for trees were given in a sequence of papers
[MKR95, KP96, KS97, EJKP97]. Rings have been recently addressed in
[GK97] and meshes were studied in [RU94, AR95, KT95]. Kleinberg and Tardos
[KT95] give an O(log n) approximation algorithm for meshes and certain
“nearly Eulerian planar graphs”. Rabani [Ra96] improves the bound for meshes
to O(poly(log log n)).

The on-line path coloring problem has been studied in the case of a line
topology in the context of interval graph coloring by Kierstead and Trotter
[KT81]. They give an optimal 3-competitive algorithm for the line ([KT81]).

Slusarek [S195] proved the same bound for circular arc graphs.

The path coloring problem is closely related to the virtual circuit routing
problem, motivated by its application to ATM networks. In the load version of this
problem, every requested pair must be assigned a path so as to minimize
the maximum number of paths crossing any edge. Aspnes et al. [AAFPW93]
give an O(log n) competitive algorithm for the load version. Most of the work has
concentrated on the throughput version of the problem, where every requested
pair may be either accepted or rejected. In the basic problem, also referred to as
call-control, the paths of all accepted pairs must be edge-disjoint. This
can also be generalized to the case where edges may have a given bandwidth
B (that can be viewed as having B parallel edges). Awerbuch et al. [AAP93]
prove that if B = Ω(log n) then there is an O(log n) competitive algorithm (for
throughput). They also give a lower bound of Ω(n) for deterministic algorithms
in the general case. Randomized algorithms were first studied by Awerbuch
et al. [ABFR94, AGLR94], giving an O(log Δ) competitive algorithm for trees,
where Δ is the diameter of the tree. They also show a matching lower bound.
Kleinberg and Tardos [KT95] give an O(log n) competitive algorithm for meshes
(and some generalization), improving upon a previous result of [AGLR94].

Bartal et al. [BFL96] prove that for various routing problems including the
throughput version of virtual circuit routing and the path-coloring problem there
exist networks where the competitive ratio is Ω(n^ε) (for some fixed ε) for any
randomized algorithm. Finally, the on-line version of maximizing the throughput
in optical networks was addressed in [AAFLR96].

Contributions of this paper. We consider the on-line path coloring problem
on trees, trees of rings, and mesh topologies:

-  We present an O(log n) competitive deterministic algorithm for path
coloring on meshes.

-  We prove a matching Ω(log n) lower bound for the mesh. The lower bound
holds for randomized algorithms for the load version of the virtual circuit
problem, which immediately extends to the path coloring problem.

We comment that this also provides the first lower bound for the load
version of the virtual circuit routing problem in undirected networks with
unit edge capacities [AAFPW93].

-  We give an O(log n) competitive algorithm for path coloring on arbitrary
networks with bandwidth Ω(log n) (the actual statement is somewhat more
general). This algorithm is also used as a building block for our algorithm
for path coloring on meshes. This result can be viewed as a balanced
combination of WDM and SDM technologies.

-  We give an O(log n) competitive algorithm for trees and trees of rings.

We also prove that any deterministic algorithm for trees cannot have
competitive ratio better than (even for trees with Δ = O(log n)).

A logarithmic upper bound and an Ω(√log n) lower bound for trees have
been independently obtained by Borodin, Kleinberg, and Sudan [BKS96].

Paper  structure:  Section  2  contains  the  results  for  path  coloring  with  more 
bandwidth  on  arbitrary  networks,  that  are  also  used  in  Section  3  for  the  O(logn) 
competitive  algorithm  for  path  coloring  on  meshes.  Section  4  contains  the  lower 
bound  for  meshes.  Upper  and  lower  bounds  for  trees  are  in  Section  5.  The  results 
and  the  proofs  that  are  omitted  from  this  abstract  can  be  found  in  [BL97]. 


2  Path  coloring  with  more  bandwidth 

Let G = (V, E) be a network with |V| = n vertices and |E| = m edges. We
consider the path coloring problem with bandwidth B on the edges. At the j-th
step, call j, with endpoints s_j and t_j, is presented to the algorithm, which must
assign a color c(j) and a path P(j). The goal of the on-line algorithm is to use a
set of colors of minimum cardinality C under the constraint that the bandwidth on
any edge does not exceed B.

We give an algorithm for general networks for this problem. The algorithm
fixes a set 𝒞 of C colors that it may choose from, at the beginning, based on
an estimate for the optimal performance. The basic algorithm chooses, at every
step, one path and one of these colors according to some optimization criterion.
This criterion assigns to any edge of any color an exponential function of the
current load. Our goal is to prove that the algorithm never exceeds a certain
bandwidth on any edge.

A variant of this algorithm proves to be useful (see Section 3) in obtaining
an algorithm for path coloring on meshes (with edge bandwidth 1).

In this variant we restrict the choice of the on-line algorithm for call j to a
subset 𝒞(j) of 𝒞 (that may be chosen according to some arbitrary rule) whose
cardinality is at least αC.

We thus state our results in terms of this parameter α. However, for the scope
of this section alone it is enough to set α = 1.

Let  C*  be  the  number  of  colors  used  by  the  optimal  solution  to  accommodate 
the  whole  set  of  calls,  and  let  B*  be  the  bandwidth  available  by  the  optimal 
solution  on  any  edge  of  any  color. 

We compare our algorithm to a stronger adversary that uses a bandwidth
Λ* ≤ B*C* on a single color, rather than being restricted to using C* colors
and bandwidth B* for every color.

We assume that the on-line algorithm knows a value Λ such that Λ* ≤ Λ ≤
2Λ*. This is achieved by applying a doubling technique (whose description is
omitted in this abstract) that results in increasing the competitive ratio at most
by a factor of 4.

Let the load on edge e for color c, denoted by λ_e^c(j), be the number of calls
assigned color c and a path crossing edge e when call j is presented. Let
a = 2^β. Call j is assigned a color c(j) and a path P(j) which achieve the
minimum, over all the colors in 𝒞(j) and all paths connecting s_j and t_j, of the
following “exponential cost”:
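The displayed cost formula did not survive in this copy. As an illustration only, the sketch below assumes the cost of a path P on color c is the sum over the edges e of P of a raised to the current load of e on color c, and selects the minimizing color and path with one shortest-path computation per candidate color. The names `route_call` and `assign_call`, the adjacency-dictionary input and the default `a = 2.0` are our assumptions, not the paper's.

```python
import heapq
from collections import defaultdict


def route_call(adj, load, colours, s, t, a=2.0):
    """Return (cost, colour, path) minimising sum_e a ** load[(e, colour)].

    adj:     adjacency dict of an undirected connected network.
    load:    defaultdict(int) mapping (frozenset({u, v}), colour) to load.
    colours: candidate colours for this call.
    """
    best = None
    for c in colours:
        dist, prev, done = {s: 0.0}, {}, set()
        heap = [(0.0, s)]
        while heap:  # Dijkstra with edge weight a ** (load on colour c)
            d, u = heapq.heappop(heap)
            if u in done:
                continue
            done.add(u)
            if u == t:
                break
            for v in adj[u]:
                nd = d + a ** load[(frozenset((u, v)), c)]
                if nd < dist.get(v, float("inf")):
                    dist[v], prev[v] = nd, u
                    heapq.heappush(heap, (nd, v))
        if t in dist and (best is None or dist[t] < best[0]):
            path = [t]
            while path[-1] != s:
                path.append(prev[path[-1]])
            best = (dist[t], c, path[::-1])
    return best


def assign_call(adj, load, colours, s, t, a=2.0):
    """Route one call greedily and update the loads along the chosen path."""
    _, c, path = route_call(adj, load, colours, s, t, a)
    for u, v in zip(path, path[1:]):
        load[(frozenset((u, v)), c)] += 1
    return c, path
```

On a 4-cycle with two colors, three successive calls between opposite corners are spread over both halves of the cycle on the first color before the second color is opened, so no edge carries more than one call per color.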
Theorem 1. If the number of colors used by the algorithm is C = 8(Λ/α)(2^β − 1),
where Λ ≥ Λ*, then the bandwidth is B ≤ 1 + (1/β) log(2m/α).


Proof.

Let λ be the maximum load on any edge for any color in the solution of
the on-line algorithm at the end of the sequence. Thus λ calls are assigned the
same color and a path crossing a given edge. When the last such path P(k) is
assigned to a call k, its exponential cost is at least a^{λ−1}. By definition of the
algorithm, the chosen path is the minimum cost path over all paths and colors in
𝒞(k). Therefore, at the time this call arrived, for |𝒞(k)| ≥ αC colors, any path
connecting the same pair of vertices has a cost of at least a^{λ−1}. It follows that
the sum of the exponential costs over all edges and all colors at the end of the
sequence is

Z(f) ≥ αC a^{λ−1},    (1)

where X(f) indicates the value of a function X at the end of the sequence.
Let l*_e(j) be the number of calls in the adversary solution assigned a path
crossing edge e when call j is presented.

We use the following potential function:

Φ(j) = Σ_{c∈𝒞} Σ_{e∈E} …

The sum of the exponential costs of the on-line algorithm at the end of the
sequence is also bounded by the following:

Z(f) ≤ Σ_{c∈𝒞} Σ_{e∈E} … ≤ 2(Φ(f) − Φ(0)) + 2mC.    (2)


In the following we prove that for the claimed choice of C, the potential
function does not increase after each step of the algorithm. Therefore Φ(f) ≤
Φ(0), and thus inequalities (1) and (2) can be combined to obtain:

λ ≤ 1 + (1/β) log(2m/α).
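For the reader's convenience, the combination of (1) and (2) can be written out step by step. The derivation below assumes, as reconstructed from the surrounding text, that a = 2^β, that (1) reads Z(f) ≥ αC·a^{λ−1}, and that the logarithm is taken to base 2.

```latex
\begin{align*}
\alpha C\, a^{\lambda-1} &\le Z(f)
  && \text{by (1)}\\
Z(f) &\le 2\bigl(\Phi(f)-\Phi(0)\bigr) + 2mC \;\le\; 2mC
  && \text{by (2) and } \Phi(f)\le\Phi(0)\\
a^{\lambda-1} &\le \frac{2m}{\alpha}\\
\lambda &\le 1 + \log_a \frac{2m}{\alpha}
  \;=\; 1 + \frac{1}{\beta}\,\log_2 \frac{2m}{\alpha}
  && \text{since } a = 2^{\beta}.
\end{align*}
```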

To complete the proof we prove that if C = 2(Λ/α)(2^β − 1) then for every
j, Φ(j + 1) − Φ(j) ≤ 0. An extra factor of 4 is due to the application of the
doubling technique to estimate the value of Λ. Let P*(j) be the path assigned
by the adversary for call j. The change in the potential function due to call j is:

Φ(j + 1) − Φ(j) ≤ …
Observe that for any color c ∈ 𝒞(j), the cost of any path P connecting s_j to
t_j is not less than the cost of the path P(j) on color c(j) chosen by the on-line
algorithm for call j. Therefore we get for any c ∈ 𝒞(j):

Σ_{e∈P} … ≥ Σ_{e∈P(j)} …

The above inequality also holds for P = P*(j), and hence

Φ(j + 1) − Φ(j) ≤ ((a − 1) − …) Σ_{e∈P(j)} …

Recall that a = 2^β. Thus, by choosing C = 2(Λ/α)(2^β − 1) we have that
the potential function does not increase. ■

As an application we get the following result for the on-line load balancing
problem ([AAFPW93]), in which only one color is available and the goal is to
minimize the number of paths assigned to a single edge of the network.

By applying Theorem 1 with β = 1 and α = 1 we get:

Corollary 2. There exists an algorithm for on-line load balancing that uses
O(Λ*) colors with bandwidth O(log n).

Note that Corollary 2 gives a stronger result than that of [AAFPW93], which
only shows that the on-line load is bounded by O(Λ* log n).

Finally, going back to the path coloring problem, recall that Λ* ≤ C*B*.
For an appropriate choice of β (the proof is omitted), Theorem 1 implies the
following:

Corollary  3.  Let  S  be  such  that  B*  ^  Jlog^,  and  /ei  7  >  0  be  some  positive 
coefficient.  The  algorithm  for  on-line  path  coloring  with  more  bandwidth  uses 
C  <  8(7*^  (log  ^(2^  —  1)  -h  1)  colors  with  bandwidth  B  <  jB* . 
The above corollary shows that if the bandwidth is Ω(log n), then the on-line
algorithm does not exceed the bandwidth by using O(log n) more colors.
We thus obtain the result for optical networks with general topology when the
technologies WDM and SDM are combined in a network that contains Ω(log n)
parallel fiber optic links on each connection.


3  Path  coloring  on  meshes 

In  this  section  we  present  an  O(logn)  competitive  algorithm  for  path  coloring 
on  meshes. 

G = (V, E) denotes the √n × √n two-dimensional mesh. We assume √n to
be a power of 2. Let |E| = m be the number of edges of the mesh. The vertex
of the mesh with row i and column j is denoted by G[i, j]. Given two vertices
v = G[i, j], v' = G[i', j'] we define their distance as the length of the shortest
path connecting the two vertices: d(v, v') = |i − i'| + |j − j'|.

Let α and σ be parameters that will be fixed later. Calls are divided into
short calls and long calls. A call (s, t) is long if d(s, t) > 2σ log(2m/α), and short if
d(s, t) ≤ 2σ log(2m/α); α and σ will be chosen so that σ log(2m/α) is a power of two.

We use two different algorithms for long calls and short calls. The algorithm
for long calls translates the problem in a mesh to a problem of coloring with
more bandwidth in a simulated network that is also a mesh. Theorem 1 allows
a logarithmic competitive ratio with a logarithmic bandwidth on any edge.
The  route  obtained  in  the  simulated  network  is  later  translated  into  a  route  in 
the  original  mesh,  satisfying  the  constraint  that  paths  associated  to  calls  with 
the  same  color  are  disjoint.  We  describe  in  Section  3.1  the  construction  of  the 
simulated  network,  and  in  Section  3.2  how  a  route  in  the  simulated  network  is 
transformed  into  a  route  in  the  original  mesh. 

The  algorithm  for  short  calls,  whose  description  is  omitted  in  this  abstract, 
classifies  the  calls  on  the  basis  of  their  length,  and  applies  a  greedy  algorithm 
within  each  class. 

Both  algorithms  for  long  and  short  calls  have  competitive  ratio  O(logn). 
Therefore,  we  can  state  the  following  theorem. 

Theorem 4. There exists an O(log n) competitive algorithm for path coloring on
meshes.

3.1  The  algorithm  for  the  simulated  network 

In this section we describe the algorithm for the problem of coloring and routing
calls on a simulated network of a mesh of size √n × √n.

The algorithm divides the mesh into n/(σ log(2m/α))² squares of size
σ log(2m/α) × σ log(2m/α). Square S[p, q], p, q = 1, ..., √n/(σ log(2m/α)), is the
subgraph of G induced by the set of vertices
{G[i, j] | i = (p − 1)σ log(2m/α) + 1, ..., pσ log(2m/α); j = (q − 1)σ log(2m/α) + 1, ..., qσ log(2m/α)}.

Note that long calls have their endpoints in different squares, since the
distance between the endpoints is bigger than 2σ log(2m/α).

The simulated network N of the mesh G = (V, E) is a mesh of size
√n/(σ log(2m/α)) × √n/(σ log(2m/α)).

Let m' be the number of edges of the simulated network. Every edge of N
is associated with a bandwidth equal to σ log(2m/α) = δ log(2m'/α). (Observe that
log m' = log m − Θ(log log m). Hence σ ≈ δ for large m.)

This mesh corresponds to the network obtained from the original mesh by
contracting every square of G onto a vertex and connecting every pair of vertices
representing adjacent squares with an edge. The bandwidth of the edges models
the fact that at most σ log(2m/α) edge-disjoint paths can pass between two adjacent
squares.

The  basic  idea  is  to  color  and  route  long  calls  in  the  simulated  network  using 
the  algorithm  of  Section  2  for  path  coloring  with  more  bandwidth,  and  then 
translate  the  assigned  paths  into  an  appropriate  routing  in  the  original  network. 

The  sequence  of  long  calls  in  the  mesh  G  is  transformed  into  a  sequence  of 
calls  in  the  simulated  network  in  the  most  natural  way:  Each  long  call  (s,  t)  is 
replaced  by  a  call  between  the  two  vertices  of  N  representing  the  two  squares 
containing  s  and  t. 
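The translation of a long call into a call of the simulated network can be sketched as follows. This is an illustration under our own conventions: mesh vertices are 1-indexed pairs `(i, j)`, `side` stands for the side length of a square, and the names `square_of`, `mesh_distance` and `simulated_call` are hypothetical.

```python
def mesh_distance(v, w):
    """Length of a shortest path between mesh vertices v = (i, j), w = (i', j')."""
    return abs(v[0] - w[0]) + abs(v[1] - w[1])


def square_of(v, side):
    """Indices (p, q) of the square S[p, q] that contains vertex v = (i, j)."""
    i, j = v
    return ((i - 1) // side + 1, (j - 1) // side + 1)


def simulated_call(s, t, side):
    """Map a long call (s, t) of the mesh to a call between vertices of N."""
    if mesh_distance(s, t) <= 2 * side:
        raise ValueError("not a long call")
    # Two vertices of the same square are at distance at most 2 * (side - 1),
    # so the endpoints of a long call always lie in different squares.
    return square_of(s, side), square_of(t, side)
```

For instance, with `side = 4` the call from (1, 1) to (8, 8) becomes the call between the vertices (1, 1) and (2, 2) of the simulated network.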

The  path  obtained  for  a  call  in  the  simulated  network  is  transformed  into  a 
path  in  the  original  mesh  G  respecting  the  following  rule:  The  path  in  G  will 
cross  between  adjacent  squares  in  G  where  the  path  in  N  passes  through  the 
edge  connecting  the  corresponding  nodes  in  N. 

However  we  need  that  the  paths  with  same  color  crossing  any  square  are  edge 
disjoint.  For  this  purpose  we  will  restrict  the  set  of  candidate  colors  for  each  call 
to  a  constant  fraction  of  the  overall  number  of  colors.  (Observe  that  the  design 
of  the  algorithm  of  Section  3.2  includes  this  feature). 

For  this  purpose  we  distinguish  between  the  two  squares  that  include  the 
endpoints  of  a  call,  and  the  squares  that  are  crossed  by  the  path  connecting 
the  endpoints.  We  say  that  a  call  is  internal  to  a  square  if  one  of  its  endpoints 
belongs  to  the  square.  A  call  is  called  external  to  a  square  if  it  is  not  internal  to 
the  square  and  the  path  derived  by  the  routing  in  the  simulated  network  crosses 
the  square. 

We furthermore define in any square S of the mesh three concentric regions:
S^1, S^2 and S^3 (see Figure 1). Each region contains 2 log … concentric rings of
the square. S^1 is the most external region, S^2 is internal to S^1, and S^3 is internal
to both S^1 and S^2. Finally, the area surrounded by S^3 is called the central region
of the square.


Fig.  1.  The  routing  of  a  long  call. 

The set of colors 𝒞 used by the on-line algorithm is partitioned into three
sets 𝒞^1, 𝒞^2, 𝒞^3 of equal size. If a call is associated with a color c ∈ 𝒞^i, i = 1, 2, 3,
then its two endpoints must lie in a region different from S^i, while the path
connecting the two endpoints will cross any square of the mesh different from
the two squares containing the endpoints using a ring of region S^i.

In Section 3.2 we will show how this requirement allows us to avoid intersections
between calls with the same color crossing a square.

We  further  impose  an  additional  requirement:  for  any  square,  at  most  one 
internal  call  is  associated  with  any  color.  This  requirement  is  to  avoid  conflicts 
between  paths  assigned  to  internal  calls  that  leave  a  square. 
Consider the j-th long call (s_j, t_j). Let S(s_j) and S(t_j) be the squares
containing s_j and t_j, respectively. The set 𝒞(j) of candidate colors for call j is
defined as follows. A color c ∈ 𝒞^i is in 𝒞(j) if the two following conditions hold:

1. s_j ∉ S^i(s_j) and t_j ∉ S^i(t_j), i.e., neither endpoint is in region i of its
corresponding square.

2. No call with an endpoint in S(s_j) or S(t_j) has been previously assigned
color c.
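The two conditions can be sketched in code. This is a hedged illustration rather than the paper's procedure: the number of rings per region is not legible in this copy, so it appears here as a hypothetical parameter `r`; vertices are 1-indexed pairs, `side` is the side length of a square, and all function and argument names are ours.

```python
def square_of(v, side):
    """Indices (p, q) of the square containing mesh vertex v = (i, j)."""
    i, j = v
    return ((i - 1) // side + 1, (j - 1) // side + 1)


def region_of(v, side, r):
    """Region 1, 2 or 3 of v inside its square (1 is outermost), 0 if central.

    r is the assumed number of concentric rings per region.
    """
    i, j = v
    li, lj = (i - 1) % side + 1, (j - 1) % side + 1    # local coordinates
    depth = min(li, lj, side + 1 - li, side + 1 - lj)  # ring 1 is the border
    k = (depth - 1) // r + 1
    return k if k <= 3 else 0


def candidate_colours(s, t, colour_class, used_in_square, side, r):
    """Colours allowed for the long call (s, t).

    colour_class:   dict colour -> i in {1, 2, 3}, the class of the colour.
    used_in_square: dict square -> set of colours already given to calls
                    having an endpoint in that square.
    """
    blocked = (used_in_square.get(square_of(s, side), set())
               | used_in_square.get(square_of(t, side), set()))
    forbidden = {region_of(s, side, r), region_of(t, side, r)}
    return {c for c, k in colour_class.items()
            if k not in forbidden and c not in blocked}
```

A colour of class i is rejected as soon as one endpoint lies in region i of its square, or the colour already serves a call internal to either endpoint square.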

The  algorithm  for  path  coloring  with  more  bandwidth  in  the  simulated  net¬ 
work  is  run  with  parameters  satisfying:  o:  <  ^  >  13;  and  'y  —  The  value 

a  that  defines  the  size  of  each  square  is  chosen  in  order  to  satisfy  cr  log  ^  - 

(5 log 

The  choice  of  the  parameters  is  such  that  the  adversary  bandwidth  B*  == 

^  log  ^  is  equal  to  the  maximum  number  of  calls  that  can  be  routed  through 

two  adjacent  squares,  and  the  width  ^log^  of  a  square  is  equal  to  13  times 

the  maximum  bandwidth  B  =  log  used  by  the  on-line  algorithm  for  routing 

between  two  adjacent  squares.  ^  i  i  r 

To apply the result of Corollary 3 we need the following lemma, whose proof
is omitted.

Lemma 5. The set of feasible colors $C(j)$ for a call has size at least $\alpha C$.

Therefore,  from  Corollary  3  we  can  derive  the  following  corollary,  on  the 
number  of  colors  and  the  bandwidth  used  by  the  on-line  algorithm  for  path 
coloring  with  more  bandwidth  in  the  simulated  network: 

Corollary  6.  The  algorithm  for  on-line  path  coloring  with  more  bandwidth  in 
the  simulated  network  N  uses  C  —  8C*^(log^^^ — h  1)  colors  with  bandwidth 

3.2  Routing  of  long  calls 

In this section we describe how to transform a path in the simulated network
N into a path in the mesh G, so that the paths associated to calls with the same
color are mutually edge-disjoint. A path in the simulated network indicates the
squares to cross to connect the two endpoints of a call. We are left to describe
the route followed by the path within each square.

Given a color $c \in C^i$, the set of calls accepted with that color has the
following properties:

1. At most one call is internal to each square.

2. Both endpoints of each call are outside region $S^i$ of their squares.

The  run  of  the  algorithm  for  path  coloring  with  more  bandwidth  ensures  that 
the  maximum  bandwidth  of  the  on-line  algorithm  in  the  simulated  network  is 
5  ^  log  It  follows  that  at  most  B  calls  are  assigned  with  paths  crossing 

the  boundary  between  two  adjacent  squares,  and  there  are  at  most  2B  external 
calls  for  each  square. 

We will maintain inductively the following property: a call crosses the boundary
between two squares on a row or on a column connecting the central regions
of the two squares. The central region of a square has size B x B. Since B is
the maximum number of calls routed between adjacent squares, a distinct row
or column can be associated with any call.

We first consider external calls. By induction, each external call enters the
square on a row or on a column leading to the central area. We route it towards
the central area until a free ring of region $S^i$ is reached. Such a ring always exists,
since there are at most 2B external calls and 2B available rings in each region
$S^i$. The call then follows the ring until it reaches a free row or a free column
connecting the central region of the square to the central region of the adjacent
square to which the call is directed. The route follows this row or column
until the adjacent square.

Finally, we consider the routing of the possible single internal call. The endpoint
of the internal call is outside the area $S^i$. If the call originates in the central
area, then it can be routed through a path that reaches a free row or column
that connects the central area to the central area of the adjacent square to which
the internal call is directed. The route goes through this row or column until
the adjacent square is reached. If the internal call has its endpoint outside both
the central area and the region $S^i$, it is routed through the ring on which the
endpoint lies until it reaches a free row or column connecting the central area of
the square to the central area of the adjacent square to which the call is directed,
and then follows it until the appropriate adjacent square.

The routing of a call associated with a color of set $C^i$ is shown in Figure 1.
In particular, the figure shows the route followed in the two squares where the call
is internal, and in one square where the call is external.


4  Lower  Bounds  on  Meshes 

In this section we give a randomized lower bound of $\Omega(\log n)$ for the path coloring
problem on meshes. The lower bound also applies to the load balancing problem
([AAFPW93]) on meshes.

The lower bound is based on an application of Yao's Lemma to on-line algorithms.
We construct a distribution over request sequences, such that the number
of colors used by an optimal algorithm is always bounded by a constant while the
expected on-line load (i.e., the maximum number of paths crossing an edge) of
a deterministic algorithm is $\Omega(\log n)$. We recall that the load of a path coloring
algorithm is bounded above by the number of colors, and thus the lower bound
follows.

The distribution over request sequences is defined recursively in $L = \log_4 n$
stages as follows. At the i'th stage of the recursion, $i = 1,2,\ldots,L$, we define
a probability distribution for a $4^{L-i+1} \times 4^{L-i+1}$ square $S_i$ of the mesh. We
consider a partition of $S_i$ into 16 subsquares of size $4^{L-i} \times 4^{L-i}$. The internal
part of the square $S_i$ is defined as the square $I$ consisting of the 4 internal
subsquares in the above partition. $S_i \setminus I$ is called the external part of the square.
Let $I[x,y]$ denote the vertex with row $x$ and column $y$ in the submesh defined
by $I$, where $0 \le x,y \le 2 \cdot 4^{L-i}$. We now give for each $0 \le x < 2 \cdot 4^{L-i}$ a set of 8
vertical calls from $I[0,x]$ to $I[2 \cdot 4^{L-i},x]$. Then choose at random one of the 16
subsquares and proceed with the (i+1)'st stage of the probability distribution
for that subsquare recursively. The (L+1)'st stage of the probability distribution
contains no requests.
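The recursive definition above can be sketched as a small sampler. This is only an illustration under stated assumptions: vertices are (row, column) pairs in global mesh coordinates, the internal part I is taken to be the central 2 x 2 block of subsquares, each call runs from row 0 of I to the row at offset $2 \cdot 4^{L-i}$ in the same column, and the function name `sample_requests` is hypothetical.

```python
import random

def sample_requests(L, rng=random):
    """Sample one request sequence on a 4**L x 4**L mesh.

    Returns a list of (stage, source, target) triples; vertices are
    (row, column) pairs in global mesh coordinates (an assumed encoding).
    """
    requests = []
    top, left = 0, 0                      # top-left corner of the current square S_i
    for i in range(1, L + 1):
        sub = 4 ** (L - i)                # side of each of the 16 subsquares
        # Assumption: the internal part I is the central 2 x 2 block of subsquares.
        i_top, i_left = top + sub, left + sub
        for x in range(2 * sub):          # one column of I at a time
            src = (i_top, i_left + x)
            dst = (i_top + 2 * sub, i_left + x)
            # 8 identical vertical calls per column
            requests.extend((i, src, dst) for _ in range(8))
        # Continue with one of the 16 subsquares, chosen uniformly at random.
        top += rng.randrange(4) * sub
        left += rng.randrange(4) * sub
    return requests
```

Stage i contributes $8 \cdot 2 \cdot 4^{L-i}$ calls, which is the count used in the proof of Claim 8.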

The  next  two  claims  give  bounds  on  the  optimal  and  the  on-line  solutions. 

Claim 7 The number of colors used by an optimal algorithm for the above probability
distribution is 8.

Proof. We prove the claim by induction on $i$. If the subsquare of size $4^{L-i} \times 4^{L-i}$
chosen in the probability distribution is not in the internal part $I$, then we route
the calls given in the i'th stage through the internal part of the square, and
otherwise we route the calls through the external part of the square, so that
none of the routes will cross the routes for calls in stages $j > i$. This can be done
so that calls with distinct source and destination have disjoint paths and thus
the number of colors is 8. ■

Claim 8 Let $A_i$ be the expected average load of the on-line algorithm on the
edges in the square $S_i$. Then $A_i \ge i$.

Proof. We first prove that the average increase in the load of the edges of the
square $S_i$ due to the requests given at the i'th stage is at least 1. The number
of edges of the mesh $S_i$ is $2 \times 4^{2(L-i+1)}$. The requests given at the i'th stage
include $8 \times 2 \times 4^{L-i}$ calls between pairs of vertices such that any path between
them includes at least $2 \times 4^{L-i}$ edges in $S_i$ (even if the path passes outside the
square). Therefore, the increase of the average load on edges of $S_i$ is at least
$(8 \times 2 \times 4^{L-i}) \cdot (2 \times 4^{L-i}) / (2 \times 4^{2(L-i+1)}) = 1$.

We now prove by induction that $A_i \ge i$. For $i = 1$ it follows from the above
claim. We assume the claim holds for $i$ and prove it for $i+1$. Since the subsquare
for the (i+1)'st stage is chosen at random, the expected average load of the edges
of $S_{i+1}$ is equal to $A_i$. Since the average increase in the load of the edges of $S_{i+1}$
is at least 1 we have $A_{i+1} \ge i+1$. ■

We  conclude  the  following. 

Theorem 9. The competitive ratio of any on-line randomized path coloring
algorithm on meshes is $\Omega(\log n)$ against oblivious adversaries. The same lower
bound holds for load balancing on meshes.


5  Path  coloring  on  trees  and  trees  of  rings 

In  this  section  we  consider  the  on-line  path  coloring  problem  on  trees  and  on 
trees  of  rings. 

An algorithm for trees and trees of rings is obtained by showing that these
graphs are $O(C)$-inductive graphs, where $C$ is the maximum number of paths
that cross an edge, which is a lower bound on the optimal cost. We omit the
proof of this fact in this abstract. The upper bounds follow from a result by
Irani [I90] that the greedy on-line coloring algorithm uses $O(d \log n)$ colors on a
d-inductive graph of n vertices. We can therefore conclude:

Theorem 10. There exists an $O(\log n)$-competitive algorithm for on-line path
coloring on trees and trees of rings of n vertices.

We also prove the following lower bound on the competitive ratio of deterministic
algorithms for on-line path coloring on trees; its proof is omitted
in this abstract.

Theorem 11. Any algorithm for path coloring on trees of n vertices has a
competitive ratio of $\Omega(\log n / \log \log n)$.
Abstract. We investigate the time complexity of deciding the existence
of layouts of virtual paths in high-speed networks that enable a connection
from one vertex to all others and have maximum hop count $h$ and
maximum edge load $l$, for a stretch factor of one. We prove that the
problem of determining the existence of such layouts is NP-complete for
all given values of $h$ and $l$, except for the cases $h = 2$, $l = 1$ and $h = 1$,
any $l$, for which we give polynomial-time layout constructions.


1  Introduction 

1.1  Motivation 

Asynchronous Transfer Mode (ATM for short) is widely accepted as the most
popular architecture that supports high-speed networks, and is thoroughly
described in the literature [14, 13, 16]. ATM is based on relatively small fixed-size
packets that are routed independently, based on two small routing fields in
their header (termed virtual channel index (VCI) and virtual path index (VPI)).
At each intermediate switch, these fields serve as indices to two routing tables,
and the routing is done in accordance with the predetermined information in the
appropriate entries.

Routing in ATM is hierarchical in the sense that the VCI of a cell is ignored
as long as its VPI is not null. This algorithm effectively creates two types of
predetermined simple routes in the network, namely routes which are based on
VPIs (called virtual paths or VPs) and routes based on VCIs and VPIs (called
virtual channels or VCs). VCs are used for connecting network users, and VPs
are used for simplifying network management (routing of VCs in particular).
Thus the route of a VC may be viewed as a concatenation of complete VPs.

IT under contract N. 20244 and by the Italian MURST 40% project “Algoritmi,
Modelli di Calcolo e Strutture Informative”.

As far as the mathematical model is concerned, given a communication network,
the VPs form a set of simple paths in the network (termed the virtual path
layout (VPL for short)) on the same vertices. Each VC is thus a concatenation
of such virtual paths.

The VP layout must satisfy certain conditions to guarantee important performance
aspects of the network (see [1, 12] for technical justification of the
model for ATM networks). In particular, there are restrictions on the following
parameters:

The  hop  count:  The  number  of  VPs  which  comprise  the  path  of  a  VC  in  the 
virtual  graph.  This  parameter  determines  the  efficiency  of  the  setup  of  a  VC 
(see,  e.g.,  [4,  17,  18]). 

The  load:  The  number  of  virtual  paths  that  share  any  physical  edge.  This 
number  determines  the  size  of  the  VP  routing  tables  (see,  e.g.,  [6]). 

The  stretch  factor:  The  ratio  between  the  length  of  the  path  that  a  VC  takes 
in  the  physical  graph  and  the  shortest  possible  path  between  its  endpoints. 
This  parameter  controls  the  efficiency  of  the  utilization  of  the  network. 

In  many  works  (e.g.,  [2,  3,  12,  5]),  a  general  routing  problem  is  solved  using 
a simpler sub-problem as a building block. In this sub-problem it is required
to  enable  routing  between  all  vertices  and  a  single  vertex  (rather  than  between 
any  pair  of  vertices).  This  restricted  problem  for  the  ATM  VP  layout  problem 
is  termed  the  rooted  (or  one-to-many)  VPL  problem  [12]  and  is  the  focus  of  the 
present  work. 

1.2  Related  Work 

A few works have tackled the VP layout problem, some using empirical techniques
[1, 15], and some using theoretical analysis [12, 5, 11].

The  VP  layout  problem  is  closely  related  to  graph-embedding  problems  since 
in  both  cases  it  is  required  to  embed  one  graph  in  another  graph.  However,  while 
in  most  embedding  problems  both  graphs  are  given,  here  we  are  given  only  the 
physical  (host)  graph,  and  we  can  choose  the  embedded  graph  (in  addition  to 
the choice of the embedding itself).

Most of the performance parameters also differ between the two cases:

- While the association between the host graph and the embedded graph is
made by the dilation parameter in embedding problems, here it is made by
the stretch factor. In other words, in embedding problems it is important to
minimize the length of each individual embedded edge, while in this model
it is important to minimize the length of paths.

- The hop count parameter is closely related to the distance in the virtual
graph; however, while the distance depends only on one graph, the hop count
also depends on the physical graph (unless the stretch factor is unbounded).

- The load parameter is identical to the congestion in embedding problems,
and the different terminology is due to the loaded meaning of congestion in
the communication literature.

The computational complexity of determining the existence of a VP layout
for a given network within a given maximum hop count and a given maximum
load was investigated in [12], where the authors showed that this problem is
NP-complete when there is no limit on the stretch factor. Some polynomial-time
construction algorithms are also given in [12] for trees with stretch factor equal to one,
i.e. when the physical routed paths are shortest.


1.3  Summary  of  Results 

In this paper we improve the results of [12], concerning the computational
complexity of constructing virtual path layouts from a given node to all other nodes
in the network. While in [12] the maximum hop count $h$ and the maximum load
$l$ are not constant, here we tightly establish the border between tractability and
intractability, by determining the lowest (constant) values of $h$ and $l$ that make
the problem computationally hard. Moreover, we give efficient construction
algorithms for all the tractable cases.

Specifically, we show that the problem of determining the existence of such
layouts is NP-complete for all given values of $h$ and $l$, except for the cases
$h = 2$, $l = 1$ and $h = 1$, any $l$, for which we give polynomial-time constructions.
All results in this paper concern the stretch factor of one.

The paper is organized as follows: In Section 2 we define the model and the
related performance measures. In Section 3 we give the above-mentioned
NP-completeness results. In Section 4 we present efficient construction algorithms for
the polynomial cases, and in Section 5 we conclude and list some open problems.
Some proofs are only briefly sketched in this Extended Abstract.

2  The  Model 

Following [12] we model the underlying communication network as an undirected
graph $G = (V,E)$, where $V$ corresponds to the set of switches and $E$ to
the set of physical links between them.

Definition 1. A rooted virtual path layout (RVPL for short) $\Psi$ is a collection
of simple paths in $G$, termed virtual paths (VPs for short), and a vertex $r \in V$
termed the root of the layout (denoted $root(\Psi)$).

Definition 2. The hop count $\mathcal{H}(v)$ of a vertex $v \in V$ in an RVPL $\Psi$ is the
minimum number of VPs whose concatenation forms a shortest path in $G$ from
$v$ to $root(\Psi)$. If no such VPs exist, define $\mathcal{H}(v) = \infty$. (Note that the assumption
of stretch factor equal to one is reflected by the requirement of using shortest
paths.)

Definition 3. The maximal hop count of an RVPL $\Psi$ is defined as
$\mathcal{H}_{max}(\Psi) = \max_{v \in V} \{\mathcal{H}(v)\}$.

Definition 4. The load $\mathcal{L}(e)$ of an edge $e \in E$ in an RVPL $\Psi$ is the number of
VPs $\psi \in \Psi$ that include $e$.

Definition 5. The maximal load $\mathcal{L}_{max}(\Psi)$ of an RVPL $\Psi$ is $\max_{e \in E} \{\mathcal{L}(e)\}$.

To minimize the load, one can use an RVPL $\Psi$ which has a VP on each physical
link, i.e., $\mathcal{L}_{max}(\Psi) = 1$; however, such a layout can have a hop count equal to the
diameter of the network. The other extreme is connecting a direct VP from the
root to each other vertex, yielding $\mathcal{H}_{max} = 1$ but usually a very high $\mathcal{L}_{max}$. In
general, we are interested in the intermediate cases where we trade one parameter
for the other. The following decision problem then naturally arises.

Definition 6. $(h,l)$-RVPL Problem:

INSTANCE: A network $G = (V,E)$ and a given root $r \in V$.

QUESTION: Is there an $(h,l)$-RVPL for $G$ with root $r$, i.e. an RVPL $\Psi$ such
that $\mathcal{H}_{max}(\Psi) \le h$ and $\mathcal{L}_{max}(\Psi) \le l$?

3  The  NP-complete  Cases 

In this section we tightly establish the values of $h$ and $l$ that make the problem
of determining the existence of virtual path layouts NP-complete. Namely, we
prove the following theorem.

Theorem 7. The $(h,l)$-RVPL problems are NP-complete for any $h$ and $l$ except
for the cases $h = 1$, any $l$, and $h = 2$, $l = 1$.

First observe that the $(h,l)$-RVPL problems belong to the class NP. In fact,
given an RVPL $\Psi$ for $G = (V,E)$ with a given root $r \in V$, one can easily check
whether $\mathcal{L}(e) \le l$ for every edge $e \in E$ and whether $\mathcal{H}_{max}(\Psi) \le h$. For the latter
task we define a weighted graph $G' = (V,E')$, termed the virtual graph, with an edge
of weight $w$ connecting vertices $a$ and $b$ if and only if there is a virtual path of
length $w$ between them; then, writing $d$ for the (unweighted) distance between $r$ and $v$
in $G$, we use slight modifications of the usual shortest path algorithms to verify
that, for every vertex $v \in V \setminus \{r\}$, there is a path from $r$ to $v$ in $G'$ of length $d$
that has at most $h$ edges.
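The membership check described above can be sketched in a few lines. The code below is an illustrative verifier, not the authors' procedure: the graph is assumed to be an adjacency dictionary, a layout is assumed to be a list of VPs given as vertex lists, and the names `bfs_dist` and `is_rvpl` are hypothetical. A VP is usable in a shortest path from the root exactly when the BFS distance grows by one along each of its edges, so hop counts can be computed by relaxing usable VPs in order of the distance of their starting vertex.

```python
from collections import deque
from math import inf

def bfs_dist(adj, r):
    """Unweighted distances from r in the physical graph."""
    dist = {r: 0}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def is_rvpl(adj, r, layout, h, l):
    """Check that `layout` is an (h, l)-RVPL for the graph `adj` rooted at r."""
    # Load check: count the VPs that include each undirected edge.
    load = {}
    for vp in layout:
        for a, b in zip(vp, vp[1:]):
            if b not in adj[a]:
                return False               # the VP is not a path of G
            e = frozenset((a, b))
            load[e] = load.get(e, 0) + 1
    if any(c > l for c in load.values()):
        return False
    # Hop check: keep only VPs that advance one BFS layer per edge.
    dist = bfs_dist(adj, r)
    arcs = []
    for vp in layout:
        for a, b in ((vp[0], vp[-1]), (vp[-1], vp[0])):
            if a in dist and b in dist and dist[b] - dist[a] == len(vp) - 1:
                arcs.append((a, b))
    hops = {v: inf for v in adj}
    hops[r] = 0
    # Usable VPs always lead to a farther layer, so one pass in layer order suffices.
    for a, b in sorted(arcs, key=lambda arc: dist[arc[0]]):
        hops[b] = min(hops[b], hops[a] + 1)
    return all(hops[v] <= h for v in adj)
```

Both checks run in time polynomial in the size of the graph and the layout, which is all that membership in NP requires.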

We prove Theorem 7 in the following four lemmas. In Lemma 8 we prove that
(3,1)-RVPL is NP-complete. In Lemmas 9 and 10 we prove that for every $l \ge 2$ the
$(2,l)$-RVPL problems are NP-complete. Finally, we prove in Lemma 11 that for
every $h$ and $l$, if $(h,l)$-RVPL is NP-complete then so is $(h+1,l)$-RVPL. Thus,
the first three lemmas establish the basis of an inductive proof and Lemma 11
is the inductive step.

^ As mentioned above, the load on an edge is identical to its congestion.

Lemma 8. The (3,1)-RVPL problem is NP-complete.

Sketch of proof. In order to prove the NP-completeness of the (3,1)-RVPL
problem, we provide a polynomial time transformation from the Dominating Set
problem (DS) (known to be NP-complete; see [10]). In this problem we have a
universe set $U = \{u_1, \ldots, u_m\}$ of $m$ elements, a family $\{A_1, \ldots, A_f\}$ of $f$ subsets
of $U$ and an integer $k \le f$; we want to decide if there exist $k$ subsets $A_{j_1}, \ldots, A_{j_k}$
which cover $U$, i.e. such that $\bigcup_{i=1}^{k} A_{j_i} = U$.

Starting from an instance $I_{DS}$ of DS, we construct a graph $G$ that admits a
(3,1)-RVPL if and only if $I_{DS}$ admits a cover.

Let  G  =  (!/,£'),  where  V  =  {r}UViU{rj}U V2U V3  and  E  =  EiU E2U E3U E4 
(see  Figure  1),  with  Vi  {qa  \  cl  —  1, ...,/?  +  1),  V2  —  I  b  —  1, .  .  . , 

V3  =  {ze  I  c  =  1,  . .  . ,  m},  and  Ei  —  {{r,  qa]  |  a  =  1, 1},  £^2  =  {{^a,  v]  \ 

a  =:  I k  P  1] ,  Es  —  {{u,  rcj}  |  6  =  1, £"4  =  {{'^^6)  ^c]  |  €  Ab}- 


Fig.  1.  The  reduction  graph  for  (3, 1)-RVPL 


We show that if there are $k$ dominating sets $A_{j_1}, \ldots, A_{j_k}$, then there exists
a (3,1)-RVPL $\Psi$ for $G$, and that if there are no $k$ dominating sets, then no
(3,1)-RVPL $\Psi$ for $G$ exists. The details are omitted in this Extended Abstract.

□ 


Lemma 9. The (2,2)-RVPL problem is NP-complete.

Sketch of proof. We prove the claim by providing a polynomial time transformation
from the 3-SAT problem (see [10]). An instance of this problem is
constituted by a boolean formula $f$ over $m$ variables $x_1, \ldots, x_m$, where $f$ is in
conjunctive normal form, i.e. $f$ is the conjunction of $g$ clauses $c_1, \ldots, c_g$, each of
which is the disjunction of three literals. We want to determine whether there
exists a truth assignment for $x_1, \ldots, x_m$ which satisfies $f$.

Starting from an instance of 3-SAT, we construct a graph $G$ that admits a
(2,2)-RVPL if and only if $f$ is satisfiable.

Let $G = (V,E)$, where $V = \{r\} \cup V_1 \cup V_2 \cup V_3 \cup V_4 \cup V_5$ and
$E = E_1 \cup E_2 \cup E_3 \cup E_4 \cup E_5 \cup E_6$ (see Figure 2), with

$V_1 = \{u_a, \bar{u}_a \mid a = 1,\ldots,m\}$,
$V_2 = \{v_a, \bar{v}_a \mid a = 1,\ldots,m\}$,
$V_3 = \{q_a \mid a = 1,\ldots,m\}$,
$V_4 = \{w_{a,i} \mid a = 1,\ldots,m,\ i = 1,\ldots,4\}$,
$V_5 = \{z_{b,j} \mid b = 1,\ldots,g,\ j = 1,\ldots,4\}$,

and

$E_1 = \{\{r,u_a\}, \{r,\bar{u}_a\} \mid a = 1,\ldots,m\}$,
$E_2 = \{\{u_a,v_a\}, \{\bar{u}_a,\bar{v}_a\} \mid a = 1,\ldots,m\}$,
$E_3 = \{\{u_a,q_a\}, \{\bar{u}_a,q_a\} \mid a = 1,\ldots,m\}$,
$E_4 = \{\{q_a,w_{a,i}\} \mid a = 1,\ldots,m,\ i = 1,\ldots,4\}$,
$E_5 = \{\{v_a,z_{b,j}\} \mid a = 1,\ldots,m,\ b = 1,\ldots,g,\ j = 1,\ldots,4,\ x_a \in c_b\}$,
$E_6 = \{\{\bar{v}_a,z_{b,j}\} \mid a = 1,\ldots,m,\ b = 1,\ldots,g,\ j = 1,\ldots,4,\ \bar{x}_a \in c_b\}$.


Fig. 2. The reduction graph for (2,2)-RVPL, drawn for an example formula $f$ with two clauses


Informally, in $G$ we associate to each variable $x_a$ a truth setting component
constituted by the subgraph induced by the vertices $r$, $u_a$, $\bar{u}_a$, $v_a$, $\bar{v}_a$, $q_a$,
$w_{a,1}$, $w_{a,2}$, $w_{a,3}$ and $w_{a,4}$. To explain the intuition for our construction, consider any
path layout for the graph $G$. The restriction of this layout to this subgraph can
be associated in a natural way to a truth assignment for $x_a$. In fact, in order for
$r$ to reach the four vertices $w_{a,1}, w_{a,2}, w_{a,3}, w_{a,4}$ in at most two hops, the VP
$(r,u_a,q_a)$ or the VP $(r,\bar{u}_a,q_a)$ must belong to the RVPL.

W.l.o.g. we can then assume that exactly one of $(r,\bar{u}_a,q_a)$ and $(r,u_a,q_a)$ is in the
RVPL. In the first case the truth assignment associated to $x_a$ is true, and in
the second it is false. If the truth assignment of $x_a$ is true (resp. false), then the
RVPL can contain the VP $(r,u_a,v_a)$ (resp. $(r,\bar{u}_a,\bar{v}_a)$), so that all vertices $z_{b,j}$
corresponding to clauses $c_b$ containing $x_a$ (resp. $\bar{x}_a$) can be reached in at most
two hops, as they are directly connected to $v_a$ (resp. $\bar{v}_a$). (See Figure 3.)


Fig. 3. Path layout and truth assignment in the case of (2,2)-RVPL (left panel: $x_a$ false; right panel: $x_a$ true)


We show (details omitted in this Extended Abstract) that there is a truth
assignment satisfying $f$ if and only if there exists a (2,2)-RVPL $\Psi$ for $G$.
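To make the shape of the reduction graph concrete, here is a minimal sketch that builds it from a 3-SAT instance. It assumes the edge sets are as the surrounding text describes them: r is adjacent to both copies of each u vertex, each copy is adjacent to its own v vertex and to $q_a$, $q_a$ is adjacent to four w vertices, and the v vertex of a literal is adjacent to the four z vertices of every clause containing that literal. The vertex labels, the DIMACS-style signed integers for literals and the name `build_22_reduction` are illustrative choices, not part of the paper.

```python
def build_22_reduction(num_vars, clauses):
    """Build the (2,2)-RVPL reduction graph for a 3-SAT formula.

    clauses: iterable of 3-tuples of nonzero ints; a stands for x_a and
    -a for its negation. Returns (vertices, edges) with undirected edges
    stored as frozensets of two vertex labels.
    """
    edges = set()

    def add(p, q):
        edges.add(frozenset((p, q)))

    r = "r"
    for a in range(1, num_vars + 1):
        for s in ("+", "-"):                  # "+" is the plain copy, "-" the barred copy
            add(r, ("u", a, s))               # E1
            add(("u", a, s), ("v", a, s))     # E2
            add(("u", a, s), ("q", a))        # E3
        for i in range(1, 5):
            add(("q", a), ("w", a, i))        # E4
    for b, clause in enumerate(clauses, start=1):
        for lit in clause:
            s = "+" if lit > 0 else "-"
            for j in range(1, 5):
                add(("v", abs(lit), s), ("z", b, j))   # E5 and E6
    vertices = {x for e in edges for x in e}
    return vertices, edges
```

With m variables and g clauses over distinct variables, the graph has 1 + 9m + 4g vertices and 10m + 12g edges, so the transformation is clearly polynomial.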


□ 


Lemma 10. For every $l \ge 2$, the $(2,l)$-RVPL problem is NP-complete.

Sketch of proof. Given any $l \ge 2$, we will prove that $(2,l)$-RVPL is an
NP-complete problem by a polynomial time transformation from the 3-SAT problem
which is a generalization of the transformation presented in the proof of
Lemma 9. Let an instance of the 3-SAT problem be as defined in the proof of
Lemma 9. Starting from this instance, we construct a graph $G$ that admits a
$(2,l)$-RVPL if and only if $f$ is satisfiable.

The idea is to add a construction to each of the vertices $u_a$ and $\bar{u}_a$ which will
force an addition of $l-2$ VPs on each of the edges $\{r,u_a\}$ and $\{r,\bar{u}_a\}$ (in order
to reach all vertices in the new construction in 2 hops). In addition, we have to
enlarge the number of $w_{a,i}$ vertices (actually we will have $2l$ such vertices for
every variable $x_a$ in $f$), and the number of $z_{b,j}$ vertices to $3(l-1)+1$ for each
clause $c_b$ in the formula $f$. Note that for the special case $l = 2$ we will get exactly
the same construction as in the proof of Lemma 9.

Formally, we specify only the additions to the construction of Lemma 9. We
add the following sets of vertices:

$V_6 = \{s_{a,i}, \bar{s}_{a,i} \mid a = 1,\ldots,m,\ i = 1,\ldots,l-2\}$,
$V_7 = \{t_{a,i,j}, \bar{t}_{a,i,j} \mid a = 1,\ldots,m,\ i = 1,\ldots,l-2,\ j = 1,\ldots,l\}$,

and the following sets of edges:

$E_7 = \{\{u_a,s_{a,i}\}, \{\bar{u}_a,\bar{s}_{a,i}\} \mid a = 1,\ldots,m,\ i = 1,\ldots,l-2\}$,
$E_8 = \{\{s_{a,i},t_{a,i,j}\}, \{\bar{s}_{a,i},\bar{t}_{a,i,j}\} \mid a = 1,\ldots,m,\ i = 1,\ldots,l-2,\ j = 1,\ldots,l\}$.

We enlarge the number of $w_{a,i}$ and $z_{b,j}$ vertices as follows:

$V_4 = \{w_{a,i} \mid a = 1,\ldots,m,\ i = 1,\ldots,2l\}$,
$V_5 = \{z_{b,j} \mid b = 1,\ldots,g,\ j = 1,\ldots,3(l-1)+1\}$,

and we correspondingly enlarge the number of edges in the sets $E_4$, $E_5$ and $E_6$ as
follows:

$E_4 = \{\{q_a,w_{a,i}\} \mid a = 1,\ldots,m,\ i = 1,\ldots,2l\}$,
$E_5 = \{\{v_a,z_{b,j}\} \mid a = 1,\ldots,m,\ b = 1,\ldots,g,\ j = 1,\ldots,3(l-1)+1,\ x_a \in c_b\}$,
$E_6 = \{\{\bar{v}_a,z_{b,j}\} \mid a = 1,\ldots,m,\ b = 1,\ldots,g,\ j = 1,\ldots,3(l-1)+1,\ \bar{x}_a \in c_b\}$.

Clearly, to reach the $t_{a,i,j}$ and $\bar{t}_{a,i,j}$ vertices in two hops, we must reach each
of the $s_{a,i}$ and $\bar{s}_{a,i}$ vertices in one hop, which uses $l-2$ VPs on each of the edges
$\{r,u_a\}$ and $\{r,\bar{u}_a\}$.

The rest of the proof is a generalization of the proof of Lemma 9, and is
omitted in this Extended Abstract. □


Lemma 11. For every $h$ and $l$, if the $(h,l)$-RVPL problem is NP-complete then
$(h+1,l)$-RVPL is also an NP-complete problem.

Sketch of proof. We assume that $(h,l)$-RVPL is NP-complete and prove that
$(h+1,l)$-RVPL is also NP-complete by a polynomial transformation from
$(h,l)$-RVPL. Given an instance of $(h,l)$-RVPL, a graph $G = (V,E)$ and a vertex $r \in V$,
we construct a graph $G' = (V',E')$ and a vertex $r' \in V'$ such that there exists an
$(h+1,l)$-RVPL for $G'$ if and only if there exists an $(h,l)$-RVPL for $G$. For every
vertex $v$ in $V$, let $deg(v)$ be the degree of the vertex (i.e., the number of vertices
adjacent to $v$ in $G$). The graph $G'$ is constructed from $G$ by adding $deg(v) \cdot l$
new vertices to each vertex $v$ in $G$, and connecting each of them to $v$. Formally,
$V' = V \cup \{w_{v,i} \mid v \in V,\ i = 1,\ldots,deg(v) \cdot l\}$,
$E' = E \cup \{\{v,w_{v,i}\} \mid v \in V,\ i = 1,\ldots,deg(v) \cdot l\}$.

The root $r'$ of $G'$ is the vertex $r \in V$. We term the vertices and edges of $G$
in $G'$ original and the rest of the vertices and edges in $G'$ new. Obviously the
transformation is polynomial in the size of the input graph $G$.
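The transformation is simple enough to state as code. The following is a minimal sketch under a few assumptions: the input graph is simple and given as an adjacency dictionary, new vertices are labelled with hypothetical tuples `("w", v, i)` that do not clash with original vertex names, and the function name `add_pendants` is illustrative.

```python
def add_pendants(adj, l):
    """Return G' built from G by attaching deg(v) * l new leaves to every vertex v.

    adj maps each vertex to a list of its neighbours (simple undirected graph).
    """
    new_adj = {v: list(nbrs) for v, nbrs in adj.items()}
    for v, nbrs in adj.items():
        for i in range(1, len(nbrs) * l + 1):
            w = ("w", v, i)            # hypothetical label for the new vertex w_{v,i}
            new_adj[v].append(w)       # new edge {v, w_{v,i}}
            new_adj[w] = [v]
    return new_adj
```

The number of added vertices is $l$ times the sum of the degrees, i.e. $2l|E|$, which confirms that the size of $G'$ is polynomial in the size of $G$ for fixed $l$.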

Assume that there is an $(h,l)$-RVPL $\Psi$ for $G$ with root $r$. To get an
$(h+1,l)$-RVPL $\Psi'$ for $G'$ with root $r'$ we add to $\Psi$ the VPs of length 1 from every $v \in V$
to every $w_{v,i}$. It can also be shown (details omitted here) that if there is an
$(h+1,l)$-RVPL $\Psi'$ for $G'$ with root $r'$, then in $\Psi'$, for every original vertex $v$,
$\mathcal{H}(v) \le h$, and thus $\Psi'$ induces an $(h,l)$-RVPL for $G$ in the natural way (remove
from $\Psi'$ all VPs with an endpoint which is a new vertex).

□ 

Sketch of proof [of Theorem 7]. We prove that for every $h$ and $l$, except for the
cases $h = 1$, any $l$, and $h = 2$, $l = 1$, the $(h,l)$-RVPL problem is NP-complete by
induction on $h$. The basis is established in Lemmas 8, 9, and 10, where we prove
that the problems $(2,l)$-RVPL for every $l \ge 2$, and (3,1)-RVPL are NP-complete. The
induction step is established in Lemma 11, where we prove that for every $h$ and
$l$, the NP-completeness of the $(h,l)$-RVPL problem implies the NP-completeness
of $(h+1,l)$-RVPL. □


4  Polynomial  Cases 

In this section we show that the above NP-completeness results are tight, by
giving polynomial-time algorithms for the (2,1)-RVPL problem and the
$(1,l)$-RVPL problems for any $l \ge 1$. We do this by applying algorithms for finding
maximum flows in networks, which are known to be polynomial (e.g., [9, 8, 7]).

Given a directed graph $G = (V,E)$ with capacities $c(e)$ (positive integers)
for the edges $e \in E$, and two specified vertices $s$ and $t$, we want to find a flow
of maximum total value from $s$ to $t$. It is well-known that in the case of unit
capacities there is a flow of value $k$ from $s$ to $t$ in $G$ iff there are $k$ edge-disjoint
paths connecting $s$ and $t$, and that this holds also in a general network with
integral capacities, provided that each edge $e$ is replaced by $x$ parallel edges of
unit capacity each, where $x$ is the original capacity of $e$.

Given a graph $G = (V,E)$ and a specified vertex $r$, to construct a $(1,l)$-RVPL
for it, we construct the graph $G' = (V',E')$ as follows. $V' = V \cup \{t\}$. For $E'$ we
construct a shortest-path BFS graph, rooted at $r$; this gives a directed layered
graph (whose layers are identical to those constructed by Dinic's Algorithm;
see [8, 7]). The vertices in layer $i$, $i \ge 0$, are exactly the vertices in $V$ whose
distance from $r$ is exactly $i$. There are no edges within a layer, and all edges are
from layer $i$ to layer $i+1$, for some $i \ge 0$. All these edges have a capacity of $l$.
We then add all the edges $(v,t)$ for every vertex $v$ in $V \setminus \{r\}$, with capacity 1.
The source and destination of $G'$ are $r$ and $t$, respectively. (See Figure 4(b).)


Fig. 4. The flow constructions: (a) initial network G; (b) the network for
(1, l)-RVPL; (c) the network for (2, 1)-RVPL.

We then run any algorithm to determine the maximum flow in this network.
By the above, there is a flow of value |V| - 1 in G' iff there are shortest
paths in G from r to all other vertices in V, such that no edge is used in more
than l of them. Hence there is a solution to the (1, l)-RVPL problem iff there
is a flow of value at least |V| - 1 in G', which yields a polynomial-time
algorithm for the (1, l)-RVPL problem.
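
The construction for the (1, l)-RVPL problem can be sketched in code as
follows. This is only an illustrative sketch under stated assumptions: the
function names, the adjacency-dictionary input format, and the use of a simple
augmenting-path (Edmonds-Karp) routine in place of the flow algorithms of
[9, 8, 7] are choices made here, not taken from the paper.

```python
from collections import deque


def bfs_distances(adj, r):
    """Distance from r to every reachable vertex of the undirected graph adj."""
    dist = {r: 0}
    queue = deque([r])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def max_flow(cap, s, t):
    """Edmonds-Karp on residual capacities cap[u][v]; cap is modified in place."""
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Bottleneck capacity along the augmenting path found by the BFS.
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck


def has_1_l_rvpl(adj, r, l):
    """Decide the (1, l)-RVPL problem via the layered network G' described above."""
    dist = bfs_distances(adj, r)
    if len(dist) < len(adj):          # some vertex is unreachable from r
        return False
    t = object()                      # fresh sink vertex (hypothetical helper)
    cap = {u: {} for u in adj}
    cap[t] = {}

    def add_edge(u, v, c):
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)       # reverse residual edge

    for u in adj:
        for v in adj[u]:
            if dist[v] == dist[u] + 1:    # edge from layer i to layer i + 1
                add_edge(u, v, l)
        if u != r:
            add_edge(u, t, 1)
    return max_flow(cap, r, t) == len(adj) - 1
```

For example, on the path r - a - b the edge between r and a lies on the shortest
paths to both a and b, so the check fails for l = 1 and succeeds for l = 2.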

Given a graph G = (V, E) and a specified vertex r, to construct a (2, 1)-RVPL
for it, we construct the following graph G' = (V', E'). V' = V ∪ {t}, and
E' is defined as follows. Let U denote the set of neighbors of r in G. For E' we
construct a shortest-path BFS graph, rooted at r (as above). This partitions the
vertices of V into layers, such that the vertices in layer i, i ≥ 0, are exactly
the vertices in V whose distance from r is exactly i. As above, there are no
edges within a layer, and all edges are from layer i to layer i + 1, for some
i ≥ 0. We then add to E' all the edges (v, t) for every vertex v in
V - (U ∪ {r}). All the edges in E' have a capacity of 1, except for the edges
emanating from r, whose capacity is |V| - |U| - 1. The source and destination of
G' are r and t, respectively. (See Figure 4(c).) It can be shown that there is a
(2, 1)-RVPL iff the maximum flow in the network is equal to |V| - |U| - 1, and
thus the problem is solved in polynomial time by solving the corresponding flow
problem.

Note  that,  in  the  case  of  h  =  1  and  arbitrary  /,  if  we  run  the  network  flow 
algorithm  on  the  original  network  to  which  t  is  added  as  above,  and  where 
each  edge  is  replaced  with  two  anti-parallel  edges  (rather  than  using  the  layered 
network)  and  capacities  are  similarly  defined,  we  can  determine  whether  a  layout 
exists,  but  with  an  arbitrary  stretch  factor. 


5  Summary  and  Open  Problems 

We have considered a routing problem termed the "rooted VP layout problem"
that arises in ATM networks and we have investigated the computational
complexity of determining the existence of an RVPL fulfilling a maximum hop
count h and a maximum load l. We have shown that deciding the existence of such
layouts is NP-complete for all values of h and l, except for the cases h = 2,
l = 1 and h = 1, any l, for which we presented polynomial-time layout
constructions, based on network flow algorithms.

In classical graph embedding problems vertices are mapped to vertices and
edges are mapped to paths connecting the images of their endpoints; this is a
very common situation in embeddings within VLSI networks. In this context, the
term dilation is used to denote the length of the longest path onto which an
edge is embedded. Since in our constructions for the NP-completeness results
the virtual paths were of length at most two, it follows that the above two
problems remain NP-complete for any given bound on the dilation.

An open problem is to extend these results to many-to-many virtual path
layouts, where we want to connect all pairs of vertices with virtual paths
under similar constraints, or to other cases in which the pairs to be connected
are specified.

A more difficult problem seems to be the one in which not only shortest
path layouts are considered, but also layouts that are within a given stretch
factor (that is, layouts in which the length of the virtual channel between the
desired vertices is bounded by the stretch factor times the length of the
shortest path between these vertices). Our polynomial-time algorithms do not
apply for a given stretch factor (though, as we noted, we can use simpler
algorithms for the case h = 1, any l, under the assumption of an arbitrary
stretch factor).

Acknowledgment:  We  thank  Shlomo  Moran  for  very  helpful  comments. 


Efficiency of Asynchronous Systems
and Read Arcs in Petri Nets


Walter Vogler *, Universität Augsburg, Germany


Abstract 

Two solutions to the MUTEX-problem are compared w.r.t. their temporal
efficiency. For this, a formerly developed efficiency-testing for asynchronous
systems is adapted to nets with so-called read arcs. The close relation between
efficiency-testing and fairness is pointed out, and it is shown that read arcs
are necessary for any solution to the MUTEX-problem.

1  Introduction 

The  testing  scenario  of  [DNH84]  has  been  developed  further  in  [Vog95b,  JV96]  in 
order  to  compare  the  temporal  efficiency  of  asynchronous  systems  -  using  Petri 
nets  as  system  models.  This  approach  is  applied  here  to  two  solutions  of  the 
MUTEX-problem  based  on  token  passing.  The  corresponding  nets  contain  what 
we  call  read  arcs,  and  one  of  our  main  results  is  that  this  is  in  fact  necessary. 


[Figure 1: a place c with the transitions t and t']

In Petri nets, the check of a side-condition is modelled with a loop as in
Figure 1: the occurrence of t removes the condition c and restores it
afterwards; hence, t and t' can occur in any order, but not at the same time.
This is certainly adequate if e.g. c models the processor that t and t' run on.
But if e.g. c is a value from a data base which can be read concurrently, then
t and t' can occur at the same time. We model such cases with special read arcs
instead of loops.

Read arcs have not received much attention in the past, probably because
loops and read arcs are treated just the same if we only look at interleaving
semantics. But they do make a difference when we explicitly take into account
concurrency. E.g. [CH93] discusses a step semantics and [MR95] defines
net-processes for nets with read arcs. In both approaches, a net with read arcs
can be translated to an equivalent net without, but it is argued in [MR95] that
the former is more natural and compact. In clear contrast, read arcs are even
better motivated in our setting, since they add relevant expressivity: the
MUTEX-problem can only be solved with nets having read arcs; this also holds if
we disregard efficiency and simply take fair behaviour as a basis.

* This work was partially supported by the DFG-project 'Halbordnungstesten'.
Author's address: Institut für Informatik, Universität Augsburg, D-86135
Augsburg, Germany, email: vogler@informatik.uni-augsburg.de

In the testing approach of [DNH84], a system is an implementation if it
performs in all environments, i.e. for all users, just as well as the
specification. While in the classical setting successful performance only
depends on the functionality, i.e. which actions are executed, the testing
approach was refined in [Vog95b] to consider also efficiency. The must-version
of this efficiency testing (concerned with worst case behaviour) is not so easy
to define in the case of asynchronous systems, where the components work with
indeterminate relative speeds; most often, this is interpreted as 'each
component may work arbitrarily slowly'. With this view, the worst case is simply
that nothing is done for a long time, hence every test is failed and we do not
have a sensible theory of testing.

As  a  way  out,  [JV96]  assumes  that  each  action  is  performed  within  one  unit  of 
time  (or  is  disabled  within  this  time).  Such  an  upper  time  bound  is  a  reasonable 
basis  for  judging  the  efficiency;  since  actions  can  also  be  performed  arbitrarily 
fast,  the  components  work  with  indeterminate  relative  speeds  even  under  this 
assumption, and we have a valid theory for asynchronous systems. It turns out
that, for the resulting testing scenario, the implementation preorder is a
sensible faster-than relation. Three variants based on dense time are
considered and each of them is shown to coincide with a discretely timed
version. In the simplest variant, which we will generalize here to nets with
read arcs, transitions must fire within time 1 after enabling, but the firing
itself is instantaneous.

After  defining  some  basic  concepts  in  Section  2,  we  define  our  asynchronous 
firing  rule  in  Section  3  and  present  a  characterization  of  the  faster-than  relation 
that  results  from  testing;  also,  the  use  of  loops  is  discussed.  Section  4  shows  the 
close relation between efficiency testing and fairness (in the sense of
progress), demonstrating that our efficiency testing is concerned with
asynchronous behaviour; it is also described how to determine the fair
behaviour of a composed system in a modular fashion. The two MUTEX-solutions
with read arcs are studied in Section 5. We view a MUTEX-solution as a
scheduler, i.e. an independent component the users have to synchronize with.
This view allows a clean formulation of the correctness requirements and fits
very well the behaviour notions we have given in Sections 3 and 4; we prove the
correctness of one of our
solutions  and  then  show  that  no  ordinary  net  without  read  arcs  can  be  correct 
in  this  sense.  Finally,  we  show  that,  from  the  point  of  view  of  one  user,  one 
solution  is  more  efficient  than  the  other. 

Due  to  lack  of  space,  the  proofs  had  to  be  omitted;  see  [Vog96],  also  for  a 
discussion  of  the  literature  on  the  efficiency  of  asynchronous  systems.  I  thank 
Roberto  Gorrieri  and  Lars  Jenner  for  their  comments,  which  helped  to  improve 
the  presentation  of  this  paper. 


2 Basic Notions of Petri Nets with Read Arcs

We use safe nets (extended with read arcs) whose transitions are labelled with
actions from some infinite alphabet $\Sigma$ or with the empty word $\lambda$,
indicating internal, unobservable actions. $\Sigma$ contains a special action
$\omega$, which we will need in our tests to indicate success.

Thus, a net $N = (S, T, F, R, l, M_N)$ consists of finite disjoint sets $S$ of
places and $T$ of transitions, the flow $F \subseteq S \times T \cup T \times S$
consisting of (ordinary) arcs, the set of read arcs
$R \subseteq S \times T \cup T \times S$, the labelling
$l : T \to \Sigma \cup \{\lambda\}$, and the initial marking
$M_N : S \to \{0, 1\}$; $R$ is always symmetric with $R \cap F = \emptyset$. As
usual, we draw transitions as boxes, places as circles and arcs as arrows; read
arcs are drawn as lines without arrow heads, i.e. we identify the two elements
$(x, y), (y, x) \in R$. The net is called ordinary, if $R = \emptyset$.

For each $x \in S \cup T$, the (full) preset of $x$ is
${}^\bullet x = \{y \mid (y, x) \in F \cup R\}$ and the (full) postset of $x$ is
$x^\bullet = \{y \mid (x, y) \in F \cup R\}$; the reduced preset of $x$ is
${}^\circ x = \{y \mid (y, x) \in F\}$ and the reduced postset of $x$ is
$x^\circ = \{y \mid (x, y) \in F\}$. If $x \in {}^\circ y \cap y^\circ$, then
$x$ and $y$ form a loop. A marking is a function $S \to \mathbb{N}_0$. We
sometimes regard sets as characteristic functions, which map the elements of
the sets to 1 and are 0 everywhere else; hence, we can e.g. add a marking and a
postset of a transition or compare them componentwise.

Our basic firing rule extends the firing rule for ordinary nets by regarding
the read arcs as loops, i.e. as ordinary arcs (since $R$ is symmetric). A
transition $t$ is enabled under a marking $M$, denoted by $M[t\rangle$, if
${}^\bullet t \le M$. If $M[t\rangle$ and
$M' = M + t^\bullet - {}^\bullet t$ (which is the same as
$M + t^\circ - {}^\circ t$), then we write $M[t\rangle M'$ and say that $t$ can
occur or fire under $M$ yielding the follower marking $M'$.
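
The basic firing rule can be illustrated with a small sketch. The
representation is an assumption made here for illustration only: a net is a
dictionary with the reduced preset (`pre`), the reduced postset (`post`) and
the read-arc places (`read`) of each transition, and, since all nets are safe,
a marking is simply the set of marked places.

```python
def full_preset(net, t):
    """Full preset of t: places it consumes from plus places it only reads."""
    return net["pre"][t] | net["read"][t]


def enabled(net, marking, t):
    """t is enabled iff every place of its full preset is marked."""
    return full_preset(net, t) <= marking


def fire(net, marking, t):
    """Follower marking M' = M - pre(t) + post(t); read places stay untouched."""
    if not enabled(net, marking, t):
        raise ValueError(f"{t} is not enabled")
    return (marking - net["pre"][t]) | net["post"][t]


def run(net, sequence):
    """Fire a sequence of transitions from the initial marking."""
    marking = net["initial"]
    for t in sequence:
        marking = fire(net, marking, t)
    return marking


# Hypothetical example net: t has a loop with place c, while u reads c
# and moves a token from place a to place b.
example = {
    "pre": {"t": frozenset({"c"}), "u": frozenset({"a"})},
    "post": {"t": frozenset({"c"}), "u": frozenset({"b"})},
    "read": {"t": frozenset(), "u": frozenset({"c"})},
    "initial": frozenset({"a", "c"}),
}
```

In this example `t` and `u` can fire in either order with the same result, and
neither firing removes the token from `c` permanently, which is why loops and
read arcs are indistinguishable at the level of ordinary firing sequences.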

Enabling and occurrence is extended to sequences as usual. If $w \in T^*$ is
enabled under $M_N$, it is called a firing sequence. We extend the labelling to
sequences of transitions as usual, i.e. homomorphically; thus, internal actions
are deleted in this image of a sequence. With this, we lift the enabledness and
firing definitions to the level of actions: a sequence $v$ of actions is
enabled under a marking $M$, denoted by $M[v\rangle\rangle$, if $M[w\rangle$ and
$l(w) = v$ for some $w \in T^*$. If $M = M_N$, then $v$ is called a trace; the
set of traces is the language of $N$.

A marking $M$ is called reachable if $M_N[w\rangle M$ for some $w \in T^*$. The
net is safe if $M(s) \le 1$ for all places $s$ and reachable markings $M$.

General assumption: All nets considered in this paper are safe and only have
transitions $t$ with ${}^\circ t \neq \emptyset$. (The latter condition is no
serious restriction, since it can be satisfied by adding a loop between $t$ and
a new marked place, if ${}^\circ t$ were empty otherwise; this addition does not
change the firing sequences.)

We use a TCSP-like parallel composition and write $\|$ for
$\|_{\Sigma - \{\omega\}}$. Nets combined with $\|_A$ run in parallel and have
to synchronize on actions from $A$. To construct $N_1 \|_A N_2$, we take the
disjoint union of $N_1$ and $N_2$, combine each $a$-labelled transition $t_1$ of
$N_1$ with each $a$-labelled transition $t_2$ from $N_2$ if $a \in A$ (i.e.
introduce a new $a$-labelled transition $(t_1, t_2)$ that inherits all arcs
from $t_1$ and $t_2$), and delete all the original $a$-labelled transitions in
$N_1$ and $N_2$ if $a \in A$.
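
The construction of the parallel composition can be sketched as follows. The
data layout is an assumption made for illustration: each net is a dictionary
with a `transitions` map (name to label, reduced preset `pre`, reduced postset
`post` and read places `read`) and an `initial` set of marked places; the
internal label is represented by `None`, and places are tagged with 1 or 2 to
make the union disjoint.

```python
def tag(places, side):
    """Tag places with the component number to obtain a disjoint union."""
    return frozenset((side, s) for s in places)


def tagged(tr, side):
    return {
        "label": tr["label"],
        "pre": tag(tr["pre"], side),
        "post": tag(tr["post"], side),
        "read": tag(tr["read"], side),
    }


def compose(n1, n2, sync):
    """Sketch of N1 ||_A N2, where sync plays the role of the action set A."""
    left = {name: tagged(tr, 1) for name, tr in n1["transitions"].items()}
    right = {name: tagged(tr, 2) for name, tr in n2["transitions"].items()}
    result = {}
    # Transitions with labels outside A are kept unchanged.
    for side, part in (("left", left), ("right", right)):
        for name, tr in part.items():
            if tr["label"] not in sync:
                result[(side, name)] = tr
    # Each pair of equally labelled transitions with label in A is merged;
    # the merged transition inherits all arcs and read arcs of both.
    for name_l, t_l in left.items():
        for name_r, t_r in right.items():
            if t_l["label"] == t_r["label"] and t_l["label"] in sync:
                result[("sync", name_l, name_r)] = {
                    "label": t_l["label"],
                    "pre": t_l["pre"] | t_r["pre"],
                    "post": t_l["post"] | t_r["post"],
                    "read": t_l["read"] | t_r["read"],
                }
    initial = tag(n1["initial"], 1) | tag(n2["initial"], 2)
    return {"transitions": result, "initial": initial}
```

Note that an `a`-labelled transition with `a` in `sync` disappears altogether
if the other component has no `a`-labelled transition, in line with the
deletion of the original transitions described above.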


3 Timed Behaviour of Asynchronous Systems

We  now  describe  the  asynchronous  behaviour  of  a  parallel  system,  taking  into 
account  at  what  times  things  happen.  Hence,  the  components  of  the  system 
vary  in  speed  —  but  we  assume  that  they  are  guaranteed  to  perform  each  enabled 
action  within  at  most  one  unit  of  time;  this  upper  time  bound  allows  the  relative 
speeds  of  the  components  to  vary  arbitrarily,  since  we  have  no  positive  lower  time 
bound.  Thus,  the  behaviour  we  define  is  truly  asynchronous. 

For  ordinary  nets,  [JV96]  bases  a  testing  preorder  on  such  an  asynchronous 
firing  rule  using  dense  time,  shows  that  one  can  just  as  well  use  discrete  time,  and 
gives  a  characterization  of  the  testing  preorder.  These  results  can  be  generalized 
to  nets  with  read  arcs  [Vog96];  here,  we  immediately  define  an  asynchronous 
firing  rule  using  discrete  time  and  present  the  respective  characterization. 

Due to the time bound 1, a newly enabled transition fires or is disabled within
time 0 - or it becomes urgent after one time-unit (denoted by $\sigma$), i.e. it
has no time left and must fire or must be disabled before the next $\sigma$.

The crucial point of read arcs is that they differ from loops w.r.t. disabling.
If we have a loop $(c, t)$, $(t, c)$ and an arc or read arc $(c, t')$ for a
place $c$ and urgent transitions $t$ and $t'$ (see Figure 1), then firing $t$
removes the token from $c$ and, thus, disables $t'$ momentarily; hence, $t'$ is
not urgent any more. If, instead, $(c, t)$ and $(t, c)$ form a read arc, $t$
just checks for the presence of a token without removing it; thus, $t'$ is not
disabled and remains urgent. Hence, $t$ and $t'$ will occur faster - and this is
what we should expect since $t$ does not block $t'$.

Definition 3.1 An instantaneous description $ID = (M, U)$ consists of a marking
$M$ and a set $U$ of urgent transitions. The initial ID is
$ID_N = (M_N, U_N)$ with $U_N = \{t \mid M_N[t\rangle\}$. We write
$(M, U)[\varepsilon\rangle(M', U')$ in one of the following cases:

1. $\varepsilon = t \in T$, $M[t\rangle M'$,
   $U' = U - \{t' \mid {}^\bullet t' \cap {}^\circ t \neq \emptyset\}$

2. $\varepsilon = \sigma$, $M = M'$, $U = \emptyset$,
   $U' = \{t \mid M[t\rangle\}$

$DFS(N) = \{w \mid ID_N[w\rangle ID\}$ is the set of (discretely timed) firing
sequences of $N$, $DL(N) = \{l(w) \mid w \in DFS(N)\}$ is the discrete language
of $N$ containing the discrete traces of $N$, where $l(\sigma) = \sigma$. For
$w \in DFS(N)$ or $w \in DL(N)$, $\zeta(w)$ is the number of $\sigma$'s in $w$.
The behaviour inbetween two $\sigma$'s is called a round.

We call a net testable, if none of its transitions is labelled with $\omega$. A
testable net $N$ satisfies a timed test $(O, D)$, $N$ must $(O, D)$, if each
$w \in DL(N \| O)$ with $\zeta(w) \ge D$ contains some $\omega$; we call a net
$N_1$ faster than a net $N_2$, $N_1 \sqsupseteq N_2$, if for all $(O, D)$ we
have $N_2$ must $(O, D)$ $\Rightarrow$ $N_1$ must $(O, D)$. □

Part 1 allows enabled transitions - urgent or not - to fire; hence, $DL(N)$
includes the language of $N$ and describes an asynchronous behaviour.
$U = \emptyset$ in Part 2 requires that no urgent transition is delayed over
the following $\sigma$. Each enabled transition is urgent after $\sigma$. Thus,
a discrete trace is any ordinary trace subdivided into rounds by $\sigma$'s such
that no transition enabled at (i.e. immediately before) one $\sigma$ is
continuously enabled until after the next $\sigma$.
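
The discrete firing rule of Definition 3.1 can be sketched in code to make the
difference between loops and read arcs concrete. Everything about the
representation is an assumption for illustration: nets are dictionaries as in
a set-based encoding of safe nets (`pre`, `post`, `read` per transition, plus
`initial`), the time step is the string `"sigma"`, and the set removed from the
urgent transitions when `t` fires is taken to be all transitions whose full
preset meets the reduced preset of `t`.

```python
SIGMA = "sigma"  # the time step


def full_preset(net, t):
    return net["pre"][t] | net["read"][t]


def enabled(net, marking, t):
    return full_preset(net, t) <= marking


def initial_id(net):
    m = net["initial"]
    return m, frozenset(t for t in net["pre"] if enabled(net, m, t))


def step(net, ident, event):
    """One step of the discrete firing rule; None if the step is not allowed."""
    marking, urgent = ident
    if event == SIGMA:
        if urgent:                      # Part 2: no urgent transition may remain
            return None
        return marking, frozenset(t for t in net["pre"] if enabled(net, marking, t))
    if not enabled(net, marking, event):
        return None
    consumed = net["pre"][event]        # tokens removed by firing (Part 1)
    new_marking = (marking - consumed) | net["post"][event]
    disabled = {x for x in net["pre"] if full_preset(net, x) & consumed}
    return new_marking, urgent - disabled


def is_discrete_firing_sequence(net, events):
    ident = initial_id(net)
    for event in events:
        ident = step(net, ident, event)
        if ident is None:
            return False
    return True


def make_net(pre, post, read, initial):
    freeze = lambda d: {t: frozenset(s) for t, s in d.items()}
    return {"pre": freeze(pre), "post": freeze(post), "read": freeze(read),
            "initial": frozenset(initial)}


# t and u both have a loop with the shared place c.
loop_net = make_net({"t": {"c"}, "u": {"c"}}, {"t": {"c"}, "u": {"c"}},
                    {"t": set(), "u": set()}, {"c"})
# t and u only read c; each has a loop with a private place (a resp. b).
read_net = make_net({"t": {"a"}, "u": {"b"}}, {"t": {"a"}, "u": {"b"}},
                    {"t": {"c"}, "u": {"c"}}, {"a", "b", "c"})
```

In `loop_net`, firing `t` momentarily disables `u`, so a time step may follow
directly; in `read_net`, `u` stays urgent after `t` and must fire before time
can pass.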

The definitions for testing are standard except for the time bound, where
we require that every run of the system embedded in the test environment is
successful within time $D$; hence, we do not consider traces that do not last
for time $D$. We call the implementation $N_1$ faster, since it might satisfy
more tests and, in particular, some test nets within a shorter time.

The test-preorder $\sqsupseteq$ formalizes observable difference in efficiency;
referring to all possible tests, it is not easy to work with directly. Thus, we
now characterize $\sqsupseteq$ by so-called i-refusal traces [JV96]: we replace
the $\sigma$'s in a discrete trace by sets of actions, which now indicate the
time-steps. Such a set contains actions that are not urgent, i.e. can be refused
when the time-step occurs.

Definition 3.2 For discrete instantaneous descriptions $(M, U)$ and $(M', U')$
we write $(M, U)[\varepsilon\rangle_r(M', U')$ if one of the following cases
applies:

1. $\varepsilon = t \in T$, $M[t\rangle M'$,
   $U' = U - \{t' \mid {}^\bullet t' \cap {}^\circ t \neq \emptyset\}$

2. $\varepsilon = X \subseteq \Sigma$, $M = M'$,
   $U' = \{t \mid M[t\rangle\}$,
   $\forall t \in U : l(t) \notin X \cup \{\lambda\}$; $X$ is a refusal set.

The corresponding i-refusal firing sequences form the set $RFS(N)$.
$RT(N) = \{l(w) \mid w \in RFS(N)\}$ is the set of i-refusal traces where
$l(X) = X$. □

Occurrence of $\Sigma$ exactly corresponds to that of $\sigma$, hence:

Prop. 3.3 For nets $N_1$ and $N_2$, $RT(N_1) \subseteq RT(N_2)$ implies
$DL(N_1) \subseteq DL(N_2)$.

We  will  show  later  that  read  arcs  add  relevant  expressivity;  here,  we  state 
that  ordinary  loops  are  in  fact  not  needed  in  nets  with  read  arcs. 

Prop. 3.4 For each net $N$, there is a loopless net $N'$ with $RT(N) = RT(N')$.

Still,  loops  are  certainly  often  adequate:  if  two  activities  run  on  the  same 
processor,  they  cannot  occur  together;  if  one  takes  place,  the  other  has  to  wait 
a  little  -  and  this  is  just  how  we  treat  two  transitions  with  a  common  loop-place 
here.  Also,  our  construction  for  3.4  makes  nets  possibly  exponentially  larger. 
Finally,  on  the  level  of  discrete  firing  sequences,  loops  have  expressivity  of  their 
own,  since  no  net  without  loops  has  the  same  discrete  firing  sequences  as  the 
one  shown  in  Figure  1: 

Prop. 3.5 If $N$ is a loopless net such that $t', tt' \in DFS(N)$, then
$t\sigma \notin DFS(N)$.

To show that RT-semantics induces a congruence for $\|_A$, one defines $\|_A$
for i-refusal traces: actions from $A$ are merged, while others are interleaved;
refusal sets are combined as in ordinary failure semantics.

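Definition 3.6 Let $u, v \in (\Sigma \cup \mathcal{P}(\Sigma))^*$,
$A \subseteq \Sigma$. Then $u \|_A v$ is the set of all
$w \in (\Sigma \cup \mathcal{P}(\Sigma))^*$ such that for some $n$ we have
$u = u_1 \ldots u_n$, $v = v_1 \ldots v_n$, $w = w_1 \ldots w_n$ and for
$i = 1, \ldots, n$ one of the following cases applies:

- $u_i = v_i = w_i \in A$

- $u_i = w_i \in (\Sigma - A)$ and $v_i = \lambda$, or
  $v_i = w_i \in (\Sigma - A)$ and $u_i = \lambda$

- $u_i, v_i, w_i \subseteq \Sigma$ and
  $w_i \subseteq ((u_i \cup v_i) \cap A) \cup (u_i \cap v_i)$ □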
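
The three cases of Definition 3.6 translate into a simple membership check. The
following sketch makes several assumptions for illustration: actions are
strings, refusal sets are `frozenset`s of actions, and traces are finite
sequences of these; the function name `in_composition` is hypothetical.

```python
from functools import lru_cache


def in_composition(u, v, w, sync):
    """Check whether the trace w belongs to u ||_A v, with A given by sync."""
    u, v, w = tuple(u), tuple(v), tuple(w)
    sync = frozenset(sync)

    def is_set(x):
        return isinstance(x, frozenset)

    @lru_cache(maxsize=None)
    def match(i, j, k):
        if k == len(w):
            return i == len(u) and j == len(v)
        x = w[k]
        ui = u[i] if i < len(u) else None
        vj = v[j] if j < len(v) else None
        if is_set(x):
            # Third case: a time step is taken jointly, and an action can be
            # refused if it is in A and one side refuses it, or if both do.
            if is_set(ui) and is_set(vj):
                allowed = ((ui | vj) & sync) | (ui & vj)
                return x <= allowed and match(i + 1, j + 1, k + 1)
            return False
        if x in sync:
            # First case: actions from A are performed together.
            return ui == x and vj == x and match(i + 1, j + 1, k + 1)
        # Second case: other actions are interleaved (the other side is padded).
        return (ui == x and match(i + 1, j, k + 1)) or \
               (vj == x and match(i, j + 1, k + 1))

    return match(0, 0, 0)
```

For instance, with A = {s}, a component refusing {s, b} combined with one
refusing only {s} can jointly refuse {s} but not {b}, since b lies outside A
and is not refused by both.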

Theorem  3.7  gives  us  one  half  of  the  characterization  in  3.8. 

Theorem 3.7 For $A \subseteq \Sigma$ and nets $N_1$ and $N_2$, we have that
$RT(N_1 \|_A N_2) = \bigcup \{u \|_A v \mid u \in RT(N_1),\ v \in RT(N_2)\}$.

Theorem 3.8 For testable nets, $N_1 \sqsupseteq N_2$ if and only if
$RT(N_1) \subseteq RT(N_2)$.

Observe that a faster system has fewer i-refusal traces; such a trace is a
witness for slow behaviour, it is something 'bad' due to the refusal
information.

Corollary  3.9  Inclusion  of  RT-semantics  is  fully  abstract  w.r.t.  inclusion  of 
DL-semantics  and  parallel  composition,  i.e.  it  is  the  coarsest  precongruence  for 
parallel  composition  that  respects  DL-inclusion. 

Theorem 3.8 essentially reduces $\sqsupseteq$ to an inclusion of regular
languages, which implies decidability. The testing preorder $\sqsupseteq$ is
also compatible with some other interesting operations, namely relabelling,
hiding and restriction.


4  Efficiency  Testing  and  Fairness 

Now we relate our notion of asynchronous behaviour to (weak) fairness (or
progress assumption); at the same time, we study compositionality for fair
behaviour. Fairness requires that a continuously enabled activity should
eventually occur; in real life, this is automatically true, i.e. it does not
have to be implemented. First, we extend the definition of the various firing
sequences to infinite sequences taking into account that an infinite run should
take infinite time.

Definition 4.1 An infinite sequence is a (discrete/i-refusal) firing sequence if
all its finite prefixes are (discrete/i-refusal) firing sequences.

A progressing (i-refusal) firing sequence is an infinite discrete, i-refusal
resp., firing sequence with infinitely many $\sigma$'s, sets resp. The images of
these sequences are the progressing (refusal) traces, forming $PL(N)$,
$PRT(N)$ resp.

For a progressing (refusal) trace $v$, $\alpha(v)$ denotes the sequence of
actions in $v$, which remains after removing all $\sigma$'s, sets resp. □

PRT-(PL-)semantics extends RT-(DL-)semantics to infinite runs, required to
take infinite time. Using König's Lemma, one can show that nets have the same
PRT-(or PL-)semantics if and only if they have the same RT-(or DL-)semantics.

Classically, an infinite firing sequence
$M_N[t_0\rangle M_1[t_1\rangle M_2 \ldots$ would be called fair if we have: if
some transition $t$ is enabled under all $M_i$ for $i > j$, then $t = t_i$ for
some $i > j$; hence, an infinite sequence of $t'$'s would not be fair in the net
of Figure 1, since $t$ is enabled under all states reached, but never occurs.
But the sequence should be fair: $t$ is not continuously enabled, since every
occurrence of $t'$ disables it momentarily, compare [Rei84, Vog95a]. Thus, we
will require that $t$ is enabled also while each $t_i$ with $i > j$ is firing.
For this, we have to keep in mind that a read arc does not consume a token.

Definition 4.2 For a transition $t$, a finite firing sequence
$M_N[t_0\rangle M_1[t_1\rangle \ldots M_n$ is $t$-fair, if not $M_n[t\rangle$.
An infinite firing sequence $M_N[t_0\rangle M_1[t_1\rangle M_2 \ldots$ is
$t$-fair, if we have: if $t$ is enabled under all $M_i - {}^\circ t_i$ for $i$
greater than some $j$, then $t = t_i$ for some $i > j$. A finite or infinite
firing sequence is fair, if it is $t$-fair for all transitions $t$. The fair
language of $N$ is $Fair(N) = \{v \mid v = l(w)$ for some fair firing sequence
$w\}$. □
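
For infinite sequences that eventually repeat a fixed cycle, Definition 4.2 can
be checked directly, which shows how it separates loops from read arcs. The
sketch below rests on assumptions made here for illustration: nets use a
set-based encoding (`pre`, `post`, `read` per transition, plus `initial`), the
infinite sequence is given as a finite prefix followed by a cycle that returns
to its starting marking, and the function name `lasso_is_fair` is hypothetical.

```python
def full_preset(net, t):
    return net["pre"][t] | net["read"][t]


def fire(net, marking, t):
    if not full_preset(net, t) <= marking:
        raise ValueError(f"{t} is not enabled")
    return (marking - net["pre"][t]) | net["post"][t]


def lasso_is_fair(net, prefix, cycle):
    """Fairness of the infinite firing sequence prefix . cycle . cycle . ..."""
    if not cycle:
        raise ValueError("the cycle must be non-empty")
    marking = net["initial"]
    for t in prefix:
        marking = fire(net, marking, t)
    start = marking
    during = []                         # the markings M_i - pre(t_i) on the cycle
    for t in cycle:
        during.append(marking - net["pre"][t])
        marking = fire(net, marking, t)
    if marking != start:
        raise ValueError("the cycle must return to its starting marking")
    for t in net["pre"]:
        if t in cycle:
            continue                    # t occurs infinitely often
        # t never occurs from some point on: it must not be enabled throughout.
        if all(full_preset(net, t) <= m for m in during):
            return False
    return True


def make_net(pre, post, read, initial):
    freeze = lambda d: {t: frozenset(s) for t, s in d.items()}
    return {"pre": freeze(pre), "post": freeze(post), "read": freeze(read),
            "initial": frozenset(initial)}


# t and u share the loop place c: each firing of u disables t momentarily.
loop_net = make_net({"t": {"c"}, "u": {"c"}}, {"t": {"c"}, "u": {"c"}},
                    {"t": set(), "u": set()}, {"c"})
# t and u only read c (with private loop places a resp. b).
read_net = make_net({"t": {"a"}, "u": {"b"}}, {"t": {"a"}, "u": {"b"}},
                    {"t": {"c"}, "u": {"c"}}, {"a", "b", "c"})
```

An infinite sequence of `u`'s is fair in `loop_net` but not in `read_net`, where
`t` stays enabled even while `u` is firing.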

Now we establish a first relation of our approach to fairness: $PL(N)$, the
infinite version of $DL(N)$, describes an asynchronous behaviour just as
$Fair(N)$.

Theorem 4.3 For all nets $N$,
$Fair(N) = \{v \mid \exists u \in PL(N) : v = \alpha(u)\}$.

Next, we determine the coarsest precongruence refining fair-language
inclusion, something that is needed when systems are constructed bottom-up with
$\|_A$. Theorem 4.5 was first obtained in [Gol88]. We improve the original
results by allowing read arcs and loops; also, Gold considered safe nets where
always ${}^\circ t \neq \emptyset$ - as we do -, but allowed unsafe nets with
isolated transitions as environments in the proof of 4.5 iii); this is
improved, too.

Definition 4.4 For a net $N$, define the fair failure semantics by
$\mathcal{FF}(N) = \{(v, X) \mid X \subseteq \Sigma$ and $v = l(w)$ for some,
possibly infinite, firing sequence $w$ that is $t$-fair for all transitions $t$
with $l(t) \in X \cup \{\lambda\}\}$. □

The intuition for $(v, X) \in \mathcal{FF}(N)$ is that all actions in $X$ can be
refused when $v$ is performed - in the sense that fairness does not force
additional performance of these actions.

Theorem 4.5 i) For all nets $N$,
$Fair(N) = \{v \mid (v, \Sigma) \in \mathcal{FF}(N)\}$.

ii) For $A \subseteq \Sigma$ and nets $N_1$ and $N_2$,
$\mathcal{FF}(N_1 \|_A N_2) = \{(w, X) \mid \exists (w_i, X_i) \in
\mathcal{FF}(N_i),\ i = 1, 2 : w \in w_1 \|_A w_2$ and
$X \subseteq ((X_1 \cup X_2) \cap A) \cup (X_1 \cap X_2)\}$.

iii) Inclusion of $\mathcal{FF}$-semantics is fully abstract w.r.t.
fair-language inclusion and parallel composition in the sense of Corollary 3.9.

This  result  and  the  following,  second  relation  to  our  testing  approach  will 
also  be  useful  in  the  next  section. 

Theorem 4.6 For a net $N$, $(v, X) \in \mathcal{FF}(N)$ if and only if there is
some $w \in PRT(N)$ such that $v = \alpha(w)$ and, for each $x \in X$, there is
some suffix of $w$ where $x$ is in all refusal sets.


5  Two  Token-Passing  MUTEX-Processes 

In this section we will show how useful, in fact necessary, read arcs are to
achieve mutual exclusion. Both our processes pass an access-token around and
only the owner of the token may access the critical section, which guarantees
mutual exclusion. MUTEX$_1$, shown below, is a modification - using read arcs -
of a Petri net solution given in [KW95]. The first user has priority, i.e. owns
the access-token lying on $p_1$. He can repeatedly request access with $r_1$,
enter the critical section with $e_1$ (marking $c_1$) and leave it with $l_1$.
The second user misses the access-token ($m_2$ is marked); if she requests
access, she has to order the token by marking $o_2$, and now the first user
might grant the token by marking $p_2$.

For MUTEX$_1$ to work properly, [KW95] assumes fairness in general: e.g., if
the internal transition ordering the token is enabled, it has to fire
eventually; otherwise the token will never be passed and the requesting user
will never enter the critical section. In our solution, it is essential that the
upper $e_1$-transition checks with a read arc that the token has not been
ordered. This check does not disable the ordering transition; so, if the latter
is enabled and time progresses, then it will order the token, which now cannot
be used by the owner to enter the critical section again and will be passed
eventually.

[Figure: the net MUTEX$_1$]


As usual, MUTEX$_1$ is seen in [KW95] as 'code', which has to be inserted into
the code of the users; e.g. the $r_1$-transition is the first user requesting
access. Since the first user should not be obliged to request, [KW95] has a
special class of 'weak' transitions for which fairness is not assumed. This
concept is not needed in our view: we see a net such as MUTEX$_1$ as a scheduler
guaranteeing mutual exclusion; the user processes are put in parallel with such
a MUTEX-process using $\|_{\{r_1, e_1, l_1, r_2, e_2, l_2\}}$, issue their
requests to it and are then allowed to enter the critical section. In this view,
the $r_1$-transition is the MUTEX-process offering the possibility to request;
if this offer is not used, then, technically, time can pass in an i-refusal
trace with a refusal set not containing $r_1$.

Our  view  seems  to  be  very  beneficial  as  a  clean  way  to  deal  with  the  question 
what  users  do  while  being  noncritical;  they  may  e.g.  communicate  with  each 
other  and  even  run  into  deadlocks  -  it  is  not  completely  clear  whether  this  is 
allowed  in  the  usual  view.  Here,  it  obviously  is  allowed,  but  we  do  not  have  to 
deal  with  it  explicitly,  since  such  a  behaviour  is  not  part  of  the  MUTEX-process. 
The  obligation  to  prove  that  a  user  can  indeed  request  becomes  obvious  in  our 
view  -  this  obligation  is  often  ignored,  see  also  below. 

For  the  solution  of  [KW95],  fairness  is  actually  not  enough;  [KW95]  therefore 
requires  a  restricted  form  of  strong  fairness  by  introducing  ‘fair  arcs’.  We  will 
show  that,  using  read  arcs,  strong  fairness  is  not  needed  at  all. 

While in MUTEX$_1$ the token has to be ordered, it is passed automatically
in MUTEX$_2$ below if it has been used or is not needed. The check whether the
token is needed or not is performed by the read arcs from $nc_1$ and $nc_2$.

We will now argue in our setting that MUTEX$_2$ is correct, omitting the
similar arguments for MUTEX$_1$. Safety is easy: if one user enters, then he
must leave before another enter is possible, since we always have exactly one
token on the places $c_1$, $p_1$, $p_2$ and $c_2$. (This set is an S-invariant,
as also used e.g. in [KW95].) Also, MUTEX$_2$ ensures that the users follow the
right protocol, i.e. it allows the actions $e_i$ and $l_i$ only to be performed
cyclically in this order. Liveness - i.e. whenever a user wishes to enter he
will be able to do so eventually - is more difficult and requires us to assume
fairness. First, we have to make sure that a user may always perform a request.


Prop. 5.1 Let $(w,X) \in \mathcal{FF}(\mathrm{MUTEX}_2)$ and $i \in \{1,2\}$. Then in $w$, $r_i$ occurs
and each $l_i$ is followed by another $r_i$, or $r_i \notin X$.

This  proposition  says  that  if  the  environment,  i.e.  the  i-th  user,  tries  to 
request  (enables  an  ri-transition  permanently)  at  a  proper  moment  (initially  or 
after  leaving,  i.e.  when  he  is  not  already  requesting  or  in  the  critical  section),  then 
the request will be performed. If it were not, neither the user (by assumption)
nor MUTEX2 (by 5.1) would refuse $r_i$, hence the combined run according to
4.5 ii) would not refuse $r_i$, i.e. it would violate fairness according to 4.5 i). By
Theorem 4.6, we can formulate 5.1 equivalently as: each $w \in PRT(\mathrm{MUTEX}_2)$
contains $r_i$ as in 5.1 or at some stage no following refusal set contains $r_i$. The
proof of 5.1 uses this variant and shows in fact that, after $l_i$, $r_i$ can be refused
at most once before it occurs again. Similar variants are used to prove 5.2 and
5.3,  where  the  former  states  that  a  user  that  enters  and  then  wants  to  leave  will 
do  so.  (In  fact,  he  will  do  so  in  the  present  or  next  round.) 

Prop. 5.2 Let $(w,X) \in \mathcal{FF}(\mathrm{MUTEX}_2)$ and $i \in \{1,2\}$. Then each $e_i$ in $w$ is
followed by an $l_i$, or $l_i \notin X$.

The  most  difficult  part  is  to  show  that  a  requesting  user  will  eventually  enter; 
here,  we  must  require  that  a  requesting  user  is  indeed  willing  to  enter  and  also 
that  a  user  that  enters  is  willing  to  leave  after  a  while.  Since  by  5.2,  willingness  to 
leave  ensures  that  this  happens  indeed,  we  can  restrict  attention  to  fair  failures 
where each $e_i$ is followed by $l_i$; for these we show that each requesting user will
enter  unless  some  user  has  requested  but  is  not  willing  to  enter. 

Prop. 5.3 Let $i \in \{1,2\}$ and $(w,X) \in \mathcal{FF}(\mathrm{MUTEX}_2)$ such that each $e_i$ is
followed by $l_i$. Then either each $r_i$ is followed by $e_i$ or for some $j \in \{1,2\}$ some
$r_j$ is not followed by $e_j$ and $e_j \notin X$.

We now come to the main result regarding the expressiveness of read arcs.

Theorem 5.4 Let N be a correct MUTEX-process, i.e. a net that satisfies
Propositions 5.1 to 5.3 and guarantees mutual exclusion, namely that $e$- and
$l$-transitions occur alternatingly. Then N has read arcs.

Independently,  [KW96]  have  shown  a  similar  result.  For  correctness,  some 
state-properties  are  required  and  a  certain  net-structure  is  prescribed  there.  The 
latter  makes  the  result  quite  dependent  on  Petri  nets  as  system  models,  whereas 
our MUTEX-specification in 5.1-5.3 is action-oriented and, thus, fairly
model-independent. Also our proof seems to be transferable to other models.

One  could  also  view  5.4  as  evidence  that  a  ‘simple’  progress  assumption  is 
not  enough  to  achieve  mutual  exclusion,  as  argued  in  [KW96],  who  recommend 
‘fair  arcs’  as  a  way  to  introduce  strong  fairness  in  a  limited  way.  Read  arcs 
seem  less  drastic,  but  they  allow  a  ‘refined’  progress  assumption,  since  with  read 
arcs  repeated  read  accesses  to  one  location  do  not  block  a  write  access  to  this 
location. This is a restricted form of what [Ray86] calls fairness of hardware.

In fact, the discussion of Dekker’s and Knuth’s algorithms in [Ray86, p. 27/28]
might give the impression that the latter does not rely on any fairness of hardware
- something that should be false in view of our theorem. And it is: without
this fairness, one user-process in Knuth’s algorithm can e.g. repeatedly test the
variable turn in its pre-protocol, thereby preventing the other process from writing
turn in its post-protocol and in effect from requesting again. Thus, 5.1 treats
a realistic possibility for failure that is often ignored.

We conclude the discussion of the MUTEX-problem by comparing the efficiency
of MUTEX1 and MUTEX2. Our results are intuitively plausible, hence
they demonstrate the feasibility of our approach.

The  first  observation  is  that  both  processes  have  their  advantages:  if  there 
is  no  competition,  then  moving  the  access-token  to  the  other  part  of  the  net  is  a 
useless  and  time  consuming  effort;  on  the  other  hand,  if  the  competition  is  strong, 
ordering the token is an additional overhead. This is demonstrated by the following
i-refusal traces. If in MUTEX2 the access-token is moved to p2 immediately
before $r_1$, then t becomes urgent only in the second round, at the end of which
$e_1$ can still be refused; we get $r_1\{e_1\}\{e_1\} \in RT(\mathrm{MUTEX}_2) \setminus RT(\mathrm{MUTEX}_1)$,
showing that sometimes MUTEX2 is slower - namely if the second user is not
interested in entering the critical section. Vice versa, MUTEX1 is sometimes
slower, as witnessed by $r_2\{e_2\}\{e_2\}\{e_2\} \in RT(\mathrm{MUTEX}_1) \setminus RT(\mathrm{MUTEX}_2)$, where
an additional round is needed to order the token.

$RT(\mathrm{MUTEX}_i)$ shows how efficiently the respective MUTEX-process serves
the  environment  consisting  of  both  users.  Interestingly,  we  can  also  use  our 
approach  to  study  a  different  view:  how  efficiently  are  the  needs  of  the  first  user 
met  by  the  system,  which  for  him  consists  of  a  MUTEX-process  and  the  second 
user?  As  second  user,  we  take  a  standard  user  who,  in  the  non-critical  section, 
can  choose  between  requesting  with  r2  and  some  other  internal  activity;  if  she 
requests,  she  is  willing  to  enter  the  critical  section  in  the  next  round  and  to 
leave it again in the round after. As a net, this user looks like the right hand
side of MUTEX2, i.e. has places nc2, req2 and c2 and the transitions between
them, plus an internal transition on a loop with nc2. We compose this user
with $\mathrm{MUTEX}_i$ via $\|_{\{r_2,e_2,l_2\}}$ and hide the synchronized actions (change them to
$\lambda$), since from the point of view of the first user they are internal activities of
the system. Thus, MUTEX1 and MUTEX2 are transformed to MUTEX3 and
MUTEX4. It is plausible that MUTEX4 is more efficient than MUTEX3: we
consider  the  worst  case  efficiency;  naturally,  for  the  first  user  strong  competition 
is  the  worst  case,  and  in  the  case  of  strong  competition  MUTEX2  is  more  efficient 
since  it  saves  the  additional  effort  of  ordering  the  token. 

Theorem 5.5 i) MUTEX4 is strictly faster than MUTEX3.

ii) The efficiency of MUTEX2 and that of MUTEX1 are incomparable.

Abstract. It is shown that bisimulation equivalence is decidable for the
processes  generated  by  (nondeterministic)  pushdown  automata  where  the 
pushdown  behaves  like  a  counter,  in  fact.  Also  regularity,  i.e.  bisimulation 
equivalence  with  some  finite-state  process,  is  shown  to  be  decidable  for  the 
mentioned  processes. 


1  Introduction 

In  recent  years,  growing  effort  has  been  devoted  to  the  area  of  verification  of 
(potentially)  infinite-state  systems.  An  important  studied  question  is  that  of 
(un)decidability for various (behavioural) equivalences. A prominent role among
these equivalences is played by bisimulation equivalence, or bisimilarity, which is
more appropriate for (concurrent, reactive etc.) systems than e.g. the traditional
language equivalence (cf. [Mil89]). Roughly speaking, two processes (states of
systems)  are  bisimilar  iff  for  any  evolving  of  one  process  caused  by  performing  an 
action  labelled  a  there  is  an  action  labelled  a  which  causes  evolving  of  the  other 
process  in  such  a  way  that  the  resulting  processes  (states)  are  again  bisimilar. 

Several recent results help to highlight and understand the decidability
boundaries for bisimilarity, which are different from those for language equivalence.
It is e.g. known that bisimilarity is decidable for Basic Parallel Processes
([CHM93]) while the language equivalence is undecidable for them ([Hir93]).
More relevant here are context-free processes (generated by context-free grammars),
also called BPA-processes, where the language equivalence is well-known
to be undecidable while bisimilarity is decidable ([CHS95]). Pushdown automata
(which are in the ‘language sense’ equivalent to context-free grammars) generate
a richer family than that of context-free processes when considering bisimulation
equivalence. These pushdown processes can be identified with ‘state-pushdown’
configurations, whose behaviour is determined by the transition rules (not allowing
ε-rules). Recently Stirling ([Sti96]) has shown the decidability of bisimilarity
for normed pushdown processes, while the question remains open for the whole
class.

Here  we  show  the  decidability  of  bisimilarity  for  another  subclass  of  pushdown 
processes:  we  will  not  impose  the  restriction  of  normedness  but  we  consider  the 
case  when  the  pushdown  behaves  like  a  counter,  in  fact;  i.e.  there  is  only  one 
stack symbol, besides a special bottom symbol which makes it possible to test ‘emptiness’
of the pushdown. Let us call such processes one-counter processes. The
decidability result for one-counter processes also confirms the conjecture by the
author  ([Jan93])  that  bisimilarity  for  labelled  Petri  nets  with  one  unbounded 
place  is  decidable  (while  two  unbounded  places  suffice  for  undecidability). 

Semidecidability  of  nonbisimilarity  of  pushdown  processes  can  be  derived 
easily in the standard way applied for image finite systems. Therefore semidecidability
of bisimilarity is what matters here. In similar cases, the key point is to
show that the bisimilarity case always has a finite (or finitely presented) witness
whose validity can be checked algorithmically. In our case, for one-counter
processes,  the  role  of  such  witnesses  is  played  by  (descriptions  of)  semilinear 
sets;  this  approach  was  already  used  in  [Jan93]  or  [Esp95]. 

Roughly speaking, the existence of such witnesses (i.e. semilinear bisimulations)
for one-counter processes can be anticipated from the intuition that two
bisimilar processes have to have the same ‘distance’ (minimum number of steps)
to a ‘bottom process’ (configuration with only the bottom symbol in the
pushdown-counter); when such bottom processes matter at all, it can be guessed
that then the counter heights of such processes have to be, in principle, linearly
related. The possibility of an algorithmic checking of a witness’ validity can be
easily observed due to the decidability of Presburger arithmetic (although this
deep result is surely not needed in its full strength).

Another  natural  decidability  question  is  that  of  regularity  of  a  given  process, 
which  will  in  our  context  mean  the  bisimulation  equivalence  with  some  finite- 
state  process.  This  problem  has  been  shown  to  be  decidable  for  labelled  Petri 
nets  ([JE96]),  which  include  BPP-processes.  In  [BCS96],  the  decidability  is 
shown  for  BPA-processes  (where  the  ‘language  regularity’  is  well-known  to  be 
undecidable).  The  question  for  the  whole  class  of  pushdown  processes  is  still  open 
(while for the class of normed pushdown processes it is easily seen to be decidable).
As  an  additional  result,  we  demonstrate  that  regularity  is  also  decidable  for  one- 
counter  processes. 

In  fact,  one-counter  processes  can  be  ‘almost’  identified  with  labelled  Petri 
nets  with  one  unbounded  place;  but  unlike  Petri  nets  they  can  ‘test  for  zero’. 
Nevertheless the strategy used in the proof of decidability of regularity for labelled
Petri nets ([JE96]) applies for them as well.

Section 2 contains definitions and states the results; the proofs are given in
Section 3. Section 4 adds some further comments.

2  Definitions  and  Results 

We begin with recalling some standard notions.

A labelled transition system, a system for short, is a tuple $T = (S, \{\xrightarrow{a}\}_{a \in A})$
where S is the set of states, A is the set of actions (or action names) and each
$\xrightarrow{a}$ is a binary (transition) relation on S ($\xrightarrow{a} \subseteq S \times S$). By $E \to F$ ($E, F \in S$)
we mean that $E \xrightarrow{a} F$ for some a; $\to^*$ denotes the reflexive and transitive
closure of the relation $\to$. By $E \to^* S'$ ($S'$ is reachable from E), where $S' \subseteq S$,
we mean $E \to^* F$ for some $F \in S'$. In the obvious sense, we also use $E \xrightarrow{u} F$
where $u \in A^*$; $|u|$ denotes the length of the sequence u.

A transition system $T = (S, \{\xrightarrow{a}\}_{a \in A})$ is finite iff S and A are finite. T
is image finite iff $succ(E) = \bigcup_{a \in A} succ_a(E)$ is finite for any $E \in S$, where we
define $succ_a(E) = \{E' \mid E \xrightarrow{a} E'\}$.

Speaking  of  a  process  E,  we  always  consider  it  as  (being  associated  with)  a 
state  in  a  transition  system  which  is  clear  from  the  context.  When  necessary, 
we denote the relevant transition system by $T(E)$. Using the term of a finite,
or  rather  a  finite-state,  process  E,  we  mean  that  T{E)  is  finite;  similarly  for  an 
image  finite  process. 

A binary relation $\mathcal{R}$ between processes is a bisimulation relation provided
that whenever $(E, F) \in \mathcal{R}$, for each action a

if $E \xrightarrow{a} E'$ then there is $F'$ s.t. $F \xrightarrow{a} F'$ and $(E', F') \in \mathcal{R}$, and

if $F \xrightarrow{a} F'$ then there is $E'$ s.t. $E \xrightarrow{a} E'$ and $(E', F') \in \mathcal{R}$.

Two processes E and F are bisimulation equivalent, or bisimilar, written $E \sim F$,
if there is a bisimulation relation $\mathcal{R}$ relating them.
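
As an illustration only (it is not part of the paper), the two transfer conditions of the definition can be checked mechanically on a finite system. The sketch below assumes a transition system encoded as a Python dictionary from states to sets of (action, successor) pairs; the names `is_bisimulation` and `EXAMPLE` are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

# Assumed encoding: state -> set of (action, successor) pairs.
LTS = Dict[str, Set[Tuple[str, str]]]


def is_bisimulation(lts: LTS, rel: Iterable[Tuple[str, str]]) -> bool:
    """Return True iff `rel` satisfies both transfer conditions."""
    rel = set(rel)
    for e, f in rel:
        # Every move E -a-> E' needs an answer F -a-> F' with (E', F') in rel.
        for a, e2 in lts[e]:
            if not any(b == a and (e2, f2) in rel for b, f2 in lts[f]):
                return False
        # Every move F -a-> F' needs an answer E -a-> E' with (E', F') in rel.
        for a, f2 in lts[f]:
            if not any(b == a and (e2, f2) in rel for b, e2 in lts[e]):
                return False
    return True


# A one-state loop (s), a two-state loop (t, u) and a deadlocked state (v).
EXAMPLE: LTS = {
    "s": {("a", "s")},
    "t": {("a", "u")},
    "u": {("a", "t")},
    "v": set(),
}
```

For instance, `is_bisimulation(EXAMPLE, {("s", "t"), ("s", "u")})` holds, so s and t are bisimilar, while no relation containing `("s", "v")` can pass the check because v cannot answer the a-move of s.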

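The family $\{\sim_n \mid n \ge 0\}$ (of relations between processes) is defined inductively:

1/ $E \sim_0 F$ for all processes E, F

2/ $E \sim_{n+1} F$ iff for each a

if $E \xrightarrow{a} E'$ then there is $F'$ s.t. $F \xrightarrow{a} F'$ and $E' \sim_n F'$, and

if $F \xrightarrow{a} F'$ then there is $E'$ s.t. $E \xrightarrow{a} E'$ and $E' \sim_n F'$.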

Let  us  recall  some  ‘folklore’  results. 

Proposition 2.1 For image finite processes, $E \sim F$ iff $\forall n \ge 0 : E \sim_n F$.
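
For a finite system the approximants can be computed directly from the inductive definition, and they stabilise after finitely many steps. The following sketch is an illustration under assumed names (`refine`, `approximant`, `bisimilarity`, `CHAIN`) and an assumed dictionary encoding of the system; it is not taken from the paper.

```python
from itertools import product


def refine(lts, rel):
    """One step of the inductive definition: build ~_{n+1} from rel = ~_n."""
    def covered(x, y, swap):
        # Every move x -a-> x2 must be answered by some y -a-> y2 such that
        # the resulting pair, written in (E', F') order, belongs to rel.
        for a, x2 in lts[x]:
            if not any(b == a and ((y2, x2) if swap else (x2, y2)) in rel
                       for b, y2 in lts[y]):
                return False
        return True

    return {(e, f) for e, f in product(lts, repeat=2)
            if covered(e, f, False) and covered(f, e, True)}


def approximant(lts, n):
    """Return ~_n as a set of pairs of states."""
    rel = set(product(lts, repeat=2))  # ~_0 relates all pairs
    for _ in range(n):
        rel = refine(lts, rel)
    return rel


def bisimilarity(lts):
    """Iterate until ~_{n+1} = ~_n; on a finite system this is ~."""
    rel = set(product(lts, repeat=2))
    while True:
        nxt = refine(lts, rel)
        if nxt == rel:
            return rel
        rel = nxt


# c2 -a-> c1 -a-> c0, where c0 has no transitions.
CHAIN = {"c0": set(), "c1": {("a", "c0")}, "c2": {("a", "c1")}}
```

In `CHAIN`, the states c1 and c2 are related by $\sim_1$ but not by $\sim_2$, which shows that the approximants can be strictly decreasing before they stabilise.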

Let us call $T = (S, \{\xrightarrow{a}\}_{a \in A})$ an admissible system iff the state set S is finite
or countably infinite (identified with a set of sequences over a finite alphabet),
the action set A is finite, T is image finite, and all the successor functions
$succ_a : S \to 2^S$ are effectively computable.

Proposition 2.2 Considering only admissible transition systems, all the relations
$E \sim_n F$ ($n \in \mathcal{N}$) are decidable. Therefore the problem $E \not\sim F$ is
semidecidable.

Now  we  define  the  pushdown  processes  (cf.  e.g.  [Sti96]);  loosely  speaking, 
these  are  state-pushdown  configurations  of  a  given  (nondeterministic)  pushdown 
automaton without ε-rules. Then we introduce the ‘one-counter case’.

Suppose a given collection (i.e. a pushdown automaton viewed as a ‘pushdown
process generator’) $M = (P, \Gamma, A, \delta)$ where $P = \{p_1, p_2, \ldots, p_k\}$ is a
finite set of states, $\Gamma = \{X_1, X_2, \ldots, X_m\}$ is a finite set of stack symbols,
$A = \{a_1, a_2, \ldots\}$ is a finite set of actions, and $\delta$ is a finite set of basic
transitions, each of the form $pX \xrightarrow{a} q\alpha$ where p, q are states, a is an action, X
is a stack symbol and $\alpha$ is a sequence of stack symbols (i.e. $\alpha \in \Gamma^*$). The transition
system $T_M$ generated by M has the expressions $p\alpha$ ($p \in P$, $\alpha \in \Gamma^*$), called
pushdown processes, as states, A is its action set, and the transition relations
are in the straightforward way determined by the basic transitions together with
the following prefix rule: if $pX \xrightarrow{a} q\alpha$ then $pX\beta \xrightarrow{a} q\alpha\beta$ (for $\beta \in \Gamma^*$).

When $\Gamma = \{X, Z\}$ and any basic transition is of the form $pX \xrightarrow{a} q\alpha$ or
$pZ \xrightarrow{a} q\alpha Z$ where $\alpha \in \{X\}^*$ (we call $M = (P, \Gamma, A, \delta)$ a one-counter machine
in such a case), then any $pXX \ldots XZ$ is called a one-counter process. For
convenience, a process $pX^mZ$ will be denoted by $p(m)$ ($m \in \mathcal{N}$, where $\mathcal{N}$ denotes
the set of all nonnegative integers).

Notice that any process reachable from a one-counter process is a one-counter
process as well. Thus for a one-counter machine M we can safely suppose that
$T_M$ has states of the form $p(m)$ only.
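
To make the counter reading of $p(m)$ concrete, here is a minimal sketch of the successor function of $T_M$ for a one-counter machine. The encoding is an assumption made for the example: a basic transition $pX \xrightarrow{a} qX^j$ (or $pZ \xrightarrow{a} qX^jZ$) is stored as the triple `(a, q, j)` under the key `(p, "X")` (or `(p, "Z")`), and the machine `ANBN` is hypothetical.

```python
def successors(rules, p, m):
    """Return the set of (action, state, counter) moves available in p(m).

    With m > 0 the top symbol is X: it is popped and j copies of X are
    pushed, giving counter m - 1 + j.  With m = 0 the top symbol is the
    bottom symbol Z, which stays in place, giving counter j.
    """
    top = "X" if m > 0 else "Z"
    rest = m - 1 if m > 0 else 0
    return {(a, q, rest + j) for (a, q, j) in rules.get((p, top), [])}


# Hypothetical machine: state p counts a's upwards, state q counts b's down.
ANBN = {
    ("p", "Z"): [("a", "p", 1)],                 # pZ -a-> pXZ
    ("p", "X"): [("a", "p", 2), ("b", "q", 0)],  # pX -a-> pXX, pX -b-> q
    ("q", "X"): [("b", "q", 0)],                 # qX -b-> q
}
```

Since `successors` returns a finite set that is computed directly from the rules, the sketch also reflects why one-counter systems are image finite with computable successor functions.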

Our  main  aim  here  is  to  show 

Theorem  2.3  Bisimulation  equivalence  is  decidable  for  one-counter  processes. 

More  precisely  it  means  that  there  is  an  algorithm  which  inputs  (descriptions 
of)  two  one-counter  processes  p(m),  p'{m')  together  with  the  respective  one- 
counter  machines  M,  M',  and  after  a  finite  amount  of  time  answers  whether  or 
not $p(m) \sim p'(m')$.

An additional result is expressed in the following theorem; here a process E
is called regular iff there is a finite-state process p s.t. $E \sim p$.

Theorem 2.4 Regularity (w.r.t. bisimilarity) is decidable for one-counter
processes.

Each  of  the  two  decidability  results  is  implied  by  two  semidecision  procedures. 
We can immediately note that semidecidability of nonbisimilarity $E \not\sim F$ follows
from  Proposition  2.2  since  one-counter  systems  (as  well  as  pushdown  systems) 
are  obviously  admissible. 

We  finish  this  section  by  recalling  some  known  notions  and  results  which  are 
then  used  in  the  proofs  in  Section  3. 

Given a transition system $T = (S, \{\xrightarrow{a}\}_{a \in A})$, we define the class of all
n-incompatible processes as $INC_n^T = \{E \mid \forall F \in S : E \not\sim_n F\}$.

More specific variants of the following two propositions were used in [JM95],
[JE96].

Proposition 2.5 For any n, $E \sim F$ implies that $E \sim_n F$ and $E \not\to^* INC_n^{T(F)}$.
In addition, the implication can be reversed for any n s.t. $\sim_{n-1}$ coincides with
$\sim_n$ (and hence with $\sim$) on $T(F)$.

Corollary 2.6 Let A be a finite transition system with k states. For any states
p, q, it holds that $p \sim q$ iff $p \sim_k q$ (iff $p \sim_{k-1} q$). It yields for any process E and
a state p of A: $E \sim p$ iff $E \sim_k p$ and $E \not\to^* INC_k^A$.

The distance of a process E to F, denoted by $Dist(E, F)$, is the length
of the shortest sequence u s.t. $E \xrightarrow{u} F$; if F is not reachable from E, we
put $Dist(E, F) = \infty$. For a set $\mathcal{F}$ of processes, we define $Dist(E, \mathcal{F}) =
\min\{Dist(E, F) \mid F \in \mathcal{F}\}$.

Proposition 2.7 If $E \sim F$ then $Dist(E, \mathcal{F}) = Dist(F, \mathcal{F})$ for any quotient
class $\mathcal{F}$ of $\sim$ on the set of all processes.
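
The distance to a set of processes is a shortest-path length, so on an image finite system it can be explored by breadth-first search. The sketch below is an assumption-laden illustration: the function name `dist`, the successor callback `succ`, the predicate `is_target` and the cut-off `max_depth` are all hypothetical, and the cut-off means that the result `math.inf` only says that no target was found within the bound.

```python
import math
from collections import deque


def dist(succ, start, is_target, max_depth):
    """Length of a shortest path from `start` to a state satisfying
    `is_target`, exploring at most `max_depth` steps; math.inf otherwise."""
    if is_target(start):
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not expand beyond the exploration bound
        for nxt in succ(state):
            if nxt in seen:
                continue
            if is_target(nxt):
                return depth + 1
            seen.add(nxt)
            queue.append((nxt, depth + 1))
    return math.inf


def counter_succ(m):
    """Toy system on counter values: increment, or decrement when positive."""
    return [m + 1] + ([m - 1] if m > 0 else [])
```

With the toy system, reaching counter value 0 from 5 takes five decrements, so `dist(counter_succ, 5, lambda m: m == 0, 10)` is 5.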

We  need  the  notion  of  semilinear  sets.  An  important  fact  is  that  they  are 
precisely  the  sets  expressible  in  Presburger  arithmetic  (cf.  [GS66]);  we  will  use 
it  implicitly  when  arguing  that  some  sets  are  semilinear. 

A set $V \subseteq \mathcal{N}^r$ of vectors ($r \ge 1$) is linear if there is a base vector $y$ and
period vectors $x_1, x_2, \ldots, x_t$ in $\mathcal{N}^r$ such that $V = \{y + \sum_{i=1}^{t} c_i x_i \mid c_i \in \mathcal{N}\}$.
V is semilinear if it is a finite union of linear sets.
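
As a small illustration of the definition (not an algorithm from the paper), membership in a linear or semilinear set can be decided by a bounded search over the coefficients $c_i$. The names `in_linear`, `in_semilinear` and the tuple encoding of vectors are assumptions of this sketch.

```python
def _reachable(rest, periods):
    """Can `rest` be written as a nonnegative combination of `periods`?"""
    if not any(rest):
        return True
    if not periods:
        return False
    first, others = periods[0], periods[1:]
    cur = rest
    # Try 0, 1, 2, ... copies of the first period vector; the loop ends
    # because `first` is nonzero, so some component eventually turns negative.
    while all(x >= 0 for x in cur):
        if _reachable(cur, others):
            return True
        cur = tuple(x - y for x, y in zip(cur, first))
    return False


def in_linear(v, base, periods):
    """Membership of vector v in the linear set given by base and periods."""
    rest = tuple(a - b for a, b in zip(v, base))
    if any(x < 0 for x in rest):
        return False
    # Zero period vectors contribute nothing and are dropped.
    return _reachable(rest, [p for p in periods if any(p)])


def in_semilinear(v, parts):
    """`parts` is a list of (base, periods) pairs, one per linear set."""
    return any(in_linear(v, base, periods) for base, periods in parts)
```

For example, with base (1, 0) and periods (1, 1) and (0, 2), the vector (3, 4) belongs to the linear set (take coefficients 2 and 1), whereas (3, 1) does not.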

In  fact,  here  we  are  mainly  interested  in  dimensions  r  =  1,2.  The  next  fact 
on one-dimensional semilinear sets is easily derivable:

Proposition 2.8 Suppose a set $V \subseteq \mathcal{N}$. Then:

1/ If there are $c, \delta \in \mathcal{N}$ s.t. $\forall m > c : m \in V \Rightarrow m + \delta \in V$ then V is
semilinear.

2/ If V is semilinear then there are constants c and $\Delta$ s.t. for any $m > c$,
the value $m \bmod \Delta$ determines whether $m \in V$ or $m \notin V$.
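
Part 2/ says that a one-dimensional semilinear set is eventually periodic. The following sketch only tests this property empirically up to a finite horizon for candidate constants; it proves nothing, and the names `depends_only_on_residue` and `member` as well as the example set are assumptions chosen for illustration.

```python
def depends_only_on_residue(member, c, delta, horizon):
    """Check, for c <= m < horizon, that membership of m is determined
    by m mod delta (an empirical test, not a proof)."""
    seen = {}
    for m in range(c, horizon):
        answer = member(m)
        if seen.setdefault(m % delta, answer) != answer:
            return False
    return True


def member(m):
    """Linear set with base 1 and periods 4 and 6: {1 + 4a + 6b | a, b >= 0}."""
    return any(m - 1 - 4 * a >= 0 and (m - 1 - 4 * a) % 6 == 0
               for a in range(m // 4 + 1))
```

The example set is {1, 5, 7, 9, 11, ...}: from 5 onwards it contains exactly the odd numbers, so c = 5 and $\Delta$ = 2 pass the test, while c = 0 fails because 1 is a member and 3 is not.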

3  Proofs 

In  this  section  we  always  (implicitly)  suppose  a  given  one-counter  machine  M 
with  k  states  (and  the  stack  alphabet  {X,  Z});  the  states  are  denoted  by  p,q 
(often  primed  or  with  subscripts). 

Subsection  3.1  proves  the  crucial  fact  of  this  paper  (Proposition  3.3)  which 
shows that the set $\{(m, n) \mid p(m) \sim q(n)\}$ is semilinear for any p, q. Subsections
3.2  and  3.3  then  prove  the  theorems. 

In the proofs we need the notion of the underlying automaton $A_M$ which
behaves like M as long as the bottom of the stack is not reached, and also the
notion of processes which are ‘Basically Incompatible’ with (states of) $A_M$.
The underlying finite automaton $A_M$ (viewed as a finite transition system)
has the same set of states as M, and it has the transition $p \xrightarrow{a} q$ iff M has a
basic transition $pX \xrightarrow{a} q\alpha$ ($\alpha \in \{X\}^*$).
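
The construction of the underlying automaton simply forgets the counter. A minimal sketch, assuming that a basic transition $pX \xrightarrow{a} qX^j$ is stored as `(a, q, j)` under the key `(p, "X")` and a bottom rule under `(p, "Z")`; this encoding and the names `underlying_automaton` and `RULES` are hypothetical.

```python
def underlying_automaton(rules):
    """Map each state p to the set of (action, q) pairs such that the
    machine has a basic transition pX -a-> q alpha; Z-rules are ignored."""
    states = set()
    for (p, _top), outs in rules.items():
        states.add(p)
        states.update(q for (_a, q, _j) in outs)
    edges = {s: set() for s in states}
    for (p, top), outs in rules.items():
        if top != "X":
            continue  # rules that test the bottom symbol do not count
        for a, q, _j in outs:
            edges[p].add((a, q))
    return edges


# Hypothetical one-counter machine used only for this example.
RULES = {
    ("p", "Z"): [("a", "p", 1), ("c", "r", 0)],
    ("p", "X"): [("a", "p", 2), ("b", "q", 0)],
    ("q", "X"): [("b", "q", 0)],
}
```

Here the state r is reachable only through a rule on Z, so it appears in the automaton without outgoing transitions, and the c-move of p is dropped.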

We define $BInc = \{p(m) \mid p(m) \in INC_k^{A_M}\} = \{p(m) \mid p(m) \not\sim_k q$ for each
state $q\}$.

When we observe that $p(m) \sim_k p$ for $m \ge k$, the next lemma is clear:

Lemma 3.1 If $p(m) \in BInc$ then $m < k$. Therefore $BInc$ is a finite, and
effectively computable, set.

Due to Corollary 2.6 we can add (recall that k denotes the number of states
of M and hence also of $A_M$):

Lemma 3.2 For $m \ge k$ (and any state p), $p(m) \sim p$ iff $p(m) \not\to^* BInc$.

Notation. By $p(m) \xrightarrow{u}_{\ge r} q(n)$ ($r \in \mathcal{N}$) we mean that there is a path
$p(m) = q_1(n_1) \to q_2(n_2) \to \cdots \to q_s(n_s) = q(n)$, labelled by u, s.t. $n_i \ge r$
for $i = 1, 2, \ldots, s$. By $p(m) \to_{pos} q(n)$ (positive) we mean that $p(m) \to_{\ge 1} q(n)$.

Observe the obvious fact (used implicitly in what follows): if $r \ge 1$ then
$p(m) \to_{\ge r} q(n)$ iff $p(m + \delta) \to_{\ge r+\delta} q(n + \delta)$ for any $\delta \in \mathcal{N}$. In particular
$p(m) \to_{pos} q(n)$ implies $p(m + \delta) \to_{pos} q(n + \delta)$.

3.1  Semilinearity  Proof 

This  subsection  is  devoted  to  a  proof  of  the  next  crucial  proposition: 

Proposition 3.3 For any one-counter machine and its states p, q, the set
$\{(m, n) \mid p(m) \sim q(n)\}$ is semilinear.

First observe that if $p(m) \to^* BInc$ and $q(n) \not\to^* BInc$ then surely
$p(m) \not\sim q(n)$ (cf. Proposition 2.7). Therefore the set $B = \{(m, n) \mid p(m) \sim q(n)\}$ can be
written as $B = B_1 \cup B_2$ where

$B_1 = \{(m, n) \mid p(m) \sim q(n), p(m) \not\to^* BInc, q(n) \not\to^* BInc\}$,

$B_2 = \{(m, n) \mid p(m) \sim q(n), p(m) \to^* BInc, q(n) \to^* BInc\}$.

Therefore it suffices to show semilinearity of $B_1$ and $B_2$.

The next lemma is a means for proving semilinearity of $B_1$.

Lemma 3.4 For any state p (of the one-counter machine M), the set
$\{m \mid p(m) \to^* BInc\}$ is semilinear; therefore also $\{m \mid p(m) \not\to^* BInc\}$ is semilinear.

Proof: Recall that we suppose M with k states; let P be the state set.

We have to show semilinearity of $R = \{m \mid p(m) \to^* BInc\}$. For any $Q \subseteq P$
we define the set $R_Q \subseteq R$ as follows: $m \in R_Q$ iff there is a ‘witness path’

$p(m) = q_1(n_1) \to q_2(n_2) \to \cdots \to q_s(n_s) \in BInc$ (1)

s.t. $q_i \in Q$ for $i = 1, 2, \ldots, s'$ where $s' \le s$ is the maximum number s.t. $n_i \ge 1$
for $i = 1, 2, \ldots, s'$ (the path goes through states from Q solely while after the first
reaching of the stack bottom - if it happens at all - there are no restrictions).

It is clear that $R_P = R$ and it suffices to show semilinearity of all $R_Q$. We
proceed by induction on $|Q|$.

When $Q = \emptyset$ then $R_Q$ is obviously semilinear ($R_Q = \emptyset$ or $R_Q = \{0\}$).

Now we show semilinearity of $R_Q$, $|Q| > 0$, while supposing semilinearity for
each $R_{Q'}$, $|Q'| < |Q|$. Let some $m \ge 2k$ be in $R_Q$ (otherwise $R_Q$ is finite, hence
semilinear) and let (1) be a relevant witness path; recall that $k > n_s$ (Lemma
3.1). We can take the leftmost subsequence $q_{i_1}(m), q_{i_2}(m-1), \ldots, q_{i_{k+1}}(m-k)$;
due to the pigeonhole principle, there is $q = q_{i_a} = q_{i_b}$ for $a \ne b$. Therefore
$p(m) \to_{pos} q(n_1') \to_{\ge n_2'} q(n_2') \to^* q_s(n_s) \in BInc$ where $\delta = n_1' - n_2' > 0$,
$n_2' > 0$, hence $q(n + \delta) \to_{\ge n} q(n)$ for any $n > 0$.

We can write $R_Q = R'_Q \cup R_{Q \setminus \{q\}}$ where

$R'_Q = \{m \in R_Q \mid$ there is a witness path with $q = q_i$ for some i, $1 \le i \le s'\}$.

Since $m \in R'_Q$ obviously implies $m + \delta \in R'_Q$, $R'_Q$ is semilinear (cf. Proposition
2.8 1/); semilinearity of $R_{Q \setminus \{q\}}$ follows from the induction hypothesis. □

Corollary 3.5 $B_1 = \{(m, n) \mid p(m) \sim q(n), p(m) \not\to^* BInc, q(n) \not\to^* BInc\}$ is
semilinear.

Proof: Given $r < k$, consider $B_1(r, -) = \{n \mid (r, n) \in B_1\}$. Note that for any
$n \in B_1(r, -)$, $n \ge k$ implies $q(n) \sim q$. Therefore when $B_1(r, -)$ is infinite, it
is the union of a finite set and the set $\{n \ge k \mid q(n) \not\to^* BInc\}$; in any case,
$B_1(r, -)$ is semilinear. Semilinearity of $B_1(-, r) = \{m \mid (m, r) \in B_1\}$ can be
established similarly. $B_1$ can be written

$B_1 = \bigcup_{r=0}^{k-1} \{(r, n) \mid n \in B_1(r, -)\} \cup \bigcup_{r=0}^{k-1} \{(m, r) \mid m \in B_1(-, r)\} \cup B_1'$

where

$B_1' = \{(m, n) \mid m \ge k, n \ge k, p(m) \sim q(n), p(m) \not\to^* BInc, q(n) \not\to^* BInc\}$.

$B_1'$ is either empty (when $p \not\sim q$) or equals $\{(m, n) \mid m \ge k, n \ge k, p(m) \not\to^*
BInc, q(n) \not\to^* BInc\}$ (when $p \sim q$).

Thus semilinearity of $B_1$ is clear. □

We  also  need  another  corollary. 

Corollary 3.6 There are constants c and $\Delta$ s.t. for any p and any $m > c$, the
value $m \bmod \Delta$ determines whether or not $p(m) \to^* BInc$.

Proof: For any state p, we get the relevant $c_p$, $\Delta_p$ due to Proposition 2.8 2/.
The constant c desired here can be taken as the maximum of $c_p$’s and $\Delta$ can be
taken as the product of $\Delta_p$’s. □

Our aim now is to show semilinearity of $B_2 = \{(m, n) \mid p(m) \sim q(n), p(m) \to^*
BInc, q(n) \to^* BInc\}$.

Notation. $Dist(p(m), BInc)$ will be denoted by $Dist(p(m))$ for short.

Since $Dist(p(m)) = Dist(q(n))$ is a necessary condition for $p(m) \sim q(n)$, we will
explore which relation it imposes for m and n. First we show that $Dist(p(m))$ is,
in fact, linear (when finite) in m with the provision that the coefficient depends
on $m \bmod \Delta$. Here and further, $\Delta$ is taken from Corollary 3.6.

Lemma 3.7 There is a constant $d \in \mathcal{N}$, and for any state p and any congruence
class $(i)_{\bmod \Delta}$ there is a rational constant $k'$ s.t. the following
holds for any m, $m \equiv i \pmod{\Delta}$: if $Dist(p(m))$ is finite then

$Dist(p(m)) \in (k'm - d, k'm + d)$.

Proof: Suppose some p and $(i)_{\bmod \Delta}$. In the proof, for each number denoted
by m we implicitly suppose $m \equiv i \pmod{\Delta}$. We show that there are $k'$ and $d'$
s.t. $Dist(p(m)) \in (k'm - d', k'm + d')$, by which we will be done (the desired d
can be taken as the maximum of all relevant constants $d'$).

Observe that $p(m) \to^* BInc$, for a large m, implies a decreasing cycle:

$p(m) \to^*_{pos} q(n + \delta) \xrightarrow{w}_{\ge n} q(n)$ for some q, $n > 0$, $\delta > 0$.

Let $Q = \{q \mid p(m) \to^*_{pos} q(n)$ for some $m, n\}$. Now let $q'$ be a
state of Q which allows a decreasing cycle $q'(n + \delta_w) \xrightarrow{w}_{\ge n} q'(n)$ (for some w,
$\delta_w > 0$, and all $n \ge 1$) with the best decreasing rate - i.e. $\delta_w / |w|$ is maximal
possible. The existence of such $q'$ can be easily derived (by ‘pigeonhole principle
reasoning’ we could suppose $|w| \le k$). Moreover we can safely suppose that
$\delta_w$ is a multiple of $\Delta$ (otherwise we take $w^{\Delta}$ which yields the same decreasing
rate), and thus $q'(n + \delta_w) \to^* BInc$ iff $q'(n) \to^* BInc$ for $n > c$, c taken from
Corollary 3.6.

Let  us  choose  m  >  c  +  ^  s.t.  p{m)  ~^pos  q'i'^)  — Bine  for  some 

u  and  n,c  <  n  <  cd-  dnj]  denote  8^  -  m  ~  n.  Note  that  p{m  +  jA)  -^pos 
q'{n  -H  jA)  — Bine  for  any  j  >  0. 

Now  let  do  -  |u|,  di  =  max{Dist{p\c  +  x))  \  x  e  {0, 1, . . . ,  ^«^)  and 
Dist{p'{c  +  a?))  is  finite  }.  Then  it  is  clear  that  for  any  m  >  c-\-  +  k 

Dist{p{m))  <  do  +  ^(m  —  8u  —  c)/ <5^;^  |?n|  +  di 

On  the  other  hand  it  is  easily  verifiable  that 

Dist(p(m))  >  (^{m.  -  8^  ~  c)/8yj  -  |u;|. 

Calculating  the  desired  k\  d'  is  now  a  technical  routine  (d  has  to  be  chosen 
large  enough  to  ‘cover’  the  finitely  many  m  <  c  +  6^;  +  ^  as  well).  L) 

Corollary 3.8 There is a constant $d \in \mathcal{N}$ s.t. for any p, q and congruence
classes $(i)_{\bmod \Delta}$, $(j)_{\bmod \Delta}$ there is a rational constant $k'$ s.t. the following
holds for any m, n, $m \equiv i \pmod{\Delta}$, $n \equiv j \pmod{\Delta}$: if $Dist(p(m)) = Dist(q(n)) <
\infty$ then $n \in (k'm - d, k'm + d)$.

Proof: Because there are constants $k_1$, $k_2$ and $d'$ s.t. $Dist(p(m)) \in (k_1 m -
d', k_1 m + d')$ and $Dist(q(n)) \in (k_2 n - d', k_2 n + d')$ then it must hold $k_2 n - d' <
k_1 m + d'$ and $k_2 n + d' > k_1 m - d'$. Hence we have $m k_1/k_2 - 2d'/k_2 < n <
m k_1/k_2 + 2d'/k_2$. □
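
Spelled out, the last step of this proof is the following elementary rearrangement. It is written here under the assumption $k_2 > 0$, which is not stated in the text but is needed for the division; the conclusion then reads $k' = k_1/k_2$ with any $d \ge 2d'/k_2$.

```latex
\begin{align*}
k_2 n - d' < \mathit{Dist}(q(n)) = \mathit{Dist}(p(m)) < k_1 m + d'
  &\;\Longrightarrow\; n < \frac{k_1}{k_2}\, m + \frac{2d'}{k_2},\\
k_2 n + d' > \mathit{Dist}(q(n)) = \mathit{Dist}(p(m)) > k_1 m - d'
  &\;\Longrightarrow\; n > \frac{k_1}{k_2}\, m - \frac{2d'}{k_2}.
\end{align*}
```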

Recall that our aim is to show semilinearity of $B_2$. We already know that
there is $d \in \mathcal{N}$ and a finite set $K = \{k_1, k_2, \ldots, k_r\}$ of rational constants s.t. it
suffices, for each $k' \in K$, to show semilinearity of the set

$B_{k'} = \{(m, n) \mid p(m) \sim q(n), n \in (k'm - d, k'm + d)\}$.

(The union of $B_{k'}$’s consists of $B_2$ and a subset of $B_1$ which is obviously
semilinear, i.e. expressible in the Presburger arithmetic).

In fact, we will consider only the subset of $B_{k'}$ where $m > c$ for a sufficiently
large c (the rest being finite and therefore causing no problems); c will be chosen
so that for any m, n, $m > c$, $|n - k'm| < d$, the following holds: for any $p'$, $q'$ and
any moves $p'(m) \to p''(m')$, $q'(n) \to q''(n')$ it is ensured that $|n' - k''m'| > d$
for each $k'' \in K$, $k'' \ne k'$ (a pair of moves cannot lead from ‘$B_{k'}$-area’ into
‘$B_{k''}$-area’).

Given $k'$, let us denote $Cut(m) = \bigcap_{i \ge 0} Cut_i(m)$ where $Cut_i(m) = \{(p', q', x) \mid
x \in \{-d, -d+1, \ldots, d\}, p'(m) \sim_i q'(round(k'm) + x)\}$.

Observe that there surely is an infinite sequence $m_0 < m_1 < m_2 < \ldots$
s.t. for all $i \geq 0$: $k'm_i$ is an integer, $m_{i+1} - m_i \equiv 0 \pmod{A}$, and
$k'm_{i+1} - k'm_i \equiv 0 \pmod{A}$. Since, for any $m$, $Cut(m)$ is a boundedly finite set,
there are surely $m$, $m'$ satisfying the assumption of the next lemma; and it is easily
observable that the lemma demonstrates semilinearity of $B_{k'}$ and thus finishes the
proof of Proposition 3.3.

Lemma 3.9 When $Cut(m) = Cut(m')$ for sufficiently large $m$ where $m < m'$,
$k'm$, $k'm'$ are integers, $m' - m \equiv 0 \pmod{A}$, $k'm' - k'm \equiv 0 \pmod{A}$, then
$Cut(m + \delta) = Cut(m' + \delta)$ for any $\delta \geq 0$.

Proof: We show $Cut(m + \delta) \subseteq Cut(m' + \delta)$ while the other inclusion will be
completely symmetric.

In fact, we show by induction on $i$ that $(p, q, x) \in Cut(m + \delta)$ implies
$(p, q, x) \in Cut_i(m' + \delta)$ for all $i$; for $i = 0$ it is trivial, as well as for $\delta = 0$.

Induction hypothesis: for any $p$, $q$, $x$, $\delta$, if $(p, q, x) \in Cut(m + \delta)$ then
$(p, q, x) \in Cut_i(m' + \delta)$.

Now we consider arbitrary (but fixed) $p$, $q$, $x$, $\delta \geq 1$ s.t. $(p, q, x) \in Cut(m + \delta)$
and we show that $(p, q, x) \in Cut_{i+1}(m' + \delta)$, by which the whole proof will be
finished.

In other words, denoting $m_1 = m + \delta$, $n_1 = round(k'(m + \delta)) + x$, $m_2 = m' + \delta$,
$n_2 = round(k'(m' + \delta)) + x$, we suppose $p(m_1) \sim q(n_1)$ and we have to show
$p(m_2) \sim_{i+1} q(n_2)$.

Let $p(m_2) \to p'(m_2 + y)$ ($-1 \leq y \leq max$, $max$ depending on the machine
M). There is the corresponding move $p(m_1) \to p'(m_1 + y)$ and there has to be a
move $q(n_1) \to q'(n_1 + z)$ ($-1 \leq z \leq max$) s.t. $p'(m_1 + y) \sim q'(n_1 + z)$. We claim
that the corresponding move $q(n_2) \to q'(n_2 + z)$ yields $p'(m_2 + y) \sim_i q'(n_2 + z)$.

When $|k'(m_1 + y) - (n_1 + z)| < d$ (hence also $|k'(m_2 + y) - (n_2 + z)| < d$),
it follows from the inductive hypothesis. Otherwise $Dist(p'(m_1 + y)) =
Dist(q'(n_1 + z)) = \infty$ and $p' \sim q'$. But then also $Dist(p'(m_2 + y)) = Dist(q'(n_2 + z)) = \infty$
(recall the property of $A$); therefore $p'(m_2 + y) \sim q'(n_2 + z)$.

The  remaining  parts  of  the  proof  are  completely  similar.  □ 


3.2  Decidability  of  Bisimilarity 

Now we can provide a proof for Theorem 2.3:

Theorem. Bisimulation equivalence is decidable for one-counter processes.

Proof: First notice that we can always consider the bisimilarity problem
instance $p(m) \sim q(n)$ where $p(m)$, $q(n)$ are associated to the same one-counter
machine (which can be achieved by taking the union of two machines, i.e. union
of action sets, and disjoint union of state sets and basic transition sets).

Recall that it suffices to show semidecidability for ‘$p(m) \sim q(n)$?’ (cf.
Proposition 2.2). Now due to Proposition 3.3 it suffices to generate all bisimulation
candidates $\mathcal{R}$ s.t. the set $\{(m', n') \mid (p'(m'), q'(n')) \in \mathcal{R}\}$ is semilinear for
each pair of states $p'$, $q'$, and for each such candidate to check if $\mathcal{R}$ actually is a
bisimulation containing $(p(m), q(n))$. (Descriptions of) such candidate relations
can obviously be generated in a systematic way, and the condition to be checked
is easily seen to be expressible in Presburger arithmetic, which is decidable (cf.
e.g. [Opp78]). □

3.3  Decidability  of  Regularity 

Here  we  provide  a  proof  for  Theorem  2.4: 

Theorem. Regularity (wrt bisimilarity) is decidable for one-counter processes.

Proof: Semidecidability of regularity of $p(m)$ follows from Theorem 2.3. (We
can generate all finite state processes, viewed as special cases of one-counter
processes, and check for each of them whether it is bisimilar to $p(m)$.)

Semidecidability  of  nonregularity  will  follow  when  we  show  that  p{m)  is  non¬ 
regular  iff  there  is  a  path 

p(m)  p'i^i)  -^*pos  -^*pos  “"P05  ^^(^2)  ^  Bine 

where $m_1 < m_2$, $n_1 > n_2$.

The  existence  of  such  a  path  ensures  for  any  i  >  0  that 

pi^rn)  p'{m2+iini-n2){m2-mi))  q'{ni-\-i{m2-mi){ni~n2))  Bine 

which implies that there are reachable states with arbitrarily large (but finite)
distances to Bine, and this obviously implies nonregularity of $p(m)$. The
opposite direction can also be easily established. □


4  Further  Comments 

The  example  of  a  pushdown  process  used  in  [Sti96] 

pX  pXX,pX  qe,pX  re,  qX  sX,  sX  qe,  rX  re 

can be easily transformed into a one-counter process with the isomorphic transition
system.  This  process  can  serve  as  an  example  of  a  one-counter  process  which 
is  not  equivalent  to  a  BPA-process,  nor  a  BPP-process,  and  when  adding  a 
rule  pX  qfin  we  get  a  one-counter  process  not  equivalent  to  any  normed 
pushdown  process. 
Abstract. We address the verification problem of FIFO-channel systems
by applying the symbolic analysis principle. We represent their sets of
states (configurations) using structures called CQDD’s combining finite-state
automata with linear constraints on the number of occurrences of symbols.
We show that CQDD’s allow forward and backward reachability
analysis of systems with nonregular sets of configurations. Moreover, we
prove that CQDD’s make it possible to compute the exact effect of the repeated
execution of any fixed cycle in the transition graph of a system. We use this
fact to define a generic reachability analysis semi-algorithm parametrized
by a set of cycles $\Theta$. Given a set of configurations, this semi-algorithm
performs a least fixpoint calculation to construct the set of its successors
(or predecessors). At each step, this calculation is accelerated by considering
the cycles in $\Theta$ as additional “meta-transitions” in the transition
graph, generalizing the approach adopted in [5].

1  Introduction 

Analyzing the behaviour of systems relies basically on solving reachability problems
in their models, which are in general finite-state automata supplied with
(possibly unbounded) data structures (Petri nets, timed or hybrid automata,
fifo-channel systems, etc). It is therefore fundamental to compute the set of all
successors or all predecessors of a given set of states $S$, i.e., the set of states that
are reachable from $S$, or those from which it is possible to reach $S$.

Let $post(S)$ (resp. $pre(S)$) denote the set of immediate successors (predecessors)
of the set $S$, and let $post^*(S)$ ($pre^*(S)$) denote the set of all its successors
(predecessors). Clearly, $post^*(S)$ is the limit of the infinite increasing sequence
$(X_i)_{i \geq 0}$ with $X_0 = S$ and $X_{i+1} = X_i \cup post(X_i)$ for every $i \geq 0$. Similarly, $pre^*(S)$
is the limit of the infinite sequence obtained by considering $pre$ instead of $post$.
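
The iteration $X_0 = S$, $X_{i+1} = X_i \cup post(X_i)$ can be sketched as follows. This is only an illustration under stated assumptions: sets of states are explicit finite Python sets (not symbolic structures), and the names `post_star`, `max_steps`, `edges` and `graph_post` are hypothetical, introduced here for the example. The step bound makes visible that the iteration is a semi-algorithm: on an infinite-state system the limit may never be reached.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, Optional

State = Hashable


def post_star(
    initial: Iterable[State],
    post: Callable[[FrozenSet[State]], FrozenSet[State]],
    max_steps: Optional[int] = None,
) -> Optional[FrozenSet[State]]:
    """Iterate X_{i+1} = X_i | post(X_i) starting from X_0 = initial.

    Returns the limit if the sequence stabilises, or None if the
    (assumed) step bound max_steps is exhausted first.
    """
    x = frozenset(initial)
    steps = 0
    while max_steps is None or steps < max_steps:
        nxt = x | post(x)
        if nxt == x:  # X_i = X_{i+1}: the limit is reached
            return x
        x = nxt
        steps += 1
    return None


# Hypothetical finite transition graph used only for illustration.
edges = {0: [1], 1: [2], 2: [0], 3: [4]}


def graph_post(xs: FrozenSet[State]) -> FrozenSet[State]:
    """Immediate successors of a set of states in the example graph."""
    return frozenset(t for s in xs for t in edges.get(s, ()))
```

Replacing `graph_post` by an immediate-predecessor function gives the corresponding sketch for $pre^*$.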

Unfortunately, for any interesting class of infinite-state systems, the sets $X_i$
are in general infinite and the sequence $(X_i)_{i \geq 0}$ is not guaranteed to reach its
limit. Hence, the first problem is to find a class of finite structures that can
represent the infinite sets of states we are interested in. This class of structures
should be effectively closed under union and the $post$ and $pre$ functions such that
the $X_i$’s can be calculated. Moreover, to compare two sets and to check whether a
given state belongs to an infinite set, the membership and the inclusion problems
of the class should be decidable.

For instance, for systems manipulating integer or real valued variables (Petri
nets or timed and hybrid automata), representation structures based on polyhedra
or sets of linear constraints are used [3, 6, 2, 13]. In systems manipulating
sequential data structures like stacks or queues, sets of states are vectors of words,
and automata-based representation structures can naturally be used.

Another problem is the convergence of the sequence of $X_i$’s. In general this
sequence never reaches its limit and an exact acceleration of the computation of
the limit is considered by defining another increasing sequence $(Y_i)_{i \geq 0}$ such that
for every $i \geq 0$, $X_i \subseteq Y_i$ and $Y_i \subseteq \bigcup_{j \geq 0} X_j$. This approach has been used [9, 7]
to define model-checking algorithms for pushdown systems using (alternating)
finite-state automata to represent sets of stack contents.

In [5], finite-state automata-based structures called QDD’s are used to represent
queue contents of fifo-channel systems (communicating finite-state machines,
CFSM). However, contrary to the case of pushdown systems, the set of reachable
states of a CFSM is not regular in general, and hence not QDD representable.
Moreover, there is no algorithm allowing to construct the set of reachable states
even if we know that it is regular [10, 12, 1]. To face this problem [5] proposes an
acceleration technique based on adding to each $X_{i+1}$ the set of states $post^*_\theta(X_i)$
which corresponds to the set of all successors after repeating as many times as possible
a cycle $\theta$ of a special kind (called meta-transitions). The restriction on the nature
of $\theta$ guarantees that the $post^*_\theta$ image of a regular set is also regular.

In this paper, we also consider CFSM’s and propose a generalization of the
approach adopted in [5] by allowing an exact acceleration of the fixpoint calculation
with the successors by any cycle in the transition graph of the system. The
difficulty comes from the fact that the set of reachable states by a cycle is in general
nonregular. Therefore, we propose a representation structure called CQDD
(constrained QDD) allowing the representation of such sets. This structure is based
on a combination of (simple) finite-state automata with Presburger arithmetic
formulas expressing constraints on the number of occurrences of symbols.

We show that CQDD’s satisfy the desirable properties of a representation
structure mentioned above. Moreover, and this constitutes our main result, we
prove that the class of CQDD representable sets of states is effectively closed
under the function $post^*_\theta$ for every cycle $\theta$. We prove also that the class of CQDD
reverse representable sets of states (their reverse image is CQDD representable)
is effectively closed under the function $pre^*_\theta$ for every cycle $\theta$. These results allow
to define a generic reachability analysis semi-algorithm which is parametrized by
a set of cycles in the transition graph of the system. When it terminates, our
algorithm returns the exact set of successors (or predecessors) of a given CQDD
representable (or CQDD reverse representable) set of states. Several analysis
algorithms can be derived from our algorithm by determining adequate strategies
for choosing the set of cycles to be considered to accelerate the fixpoint
calculation. The algorithm of [5] can be seen as a particular instance of our algorithm.

Related work: In [16, 11] a model-checking semi-algorithm is proposed for CFSM,
based on a finite representation of the state-graph by means of graph grammars.
This approach is different from ours since it is based on a finite representation
of the state-graph instead of a finite representation of the set of states. There
are other existing works on the analysis of CFSM’s assuming that the systems
have lossy or unreliable channels (queues) [1, 12]. In our work we do not make
such assumptions. Other works propose (terminating) algorithms generating an
upper approximation of the set of reachable states [15]. This is different from
our approach because we construct the exact set of reachable states as a fixpoint
calculation and help the termination of this calculation by exact accelerations.

The rest of this paper is organized as follows. In Section 2 we introduce
some basic definitions. In Section 3 we define CFSM’s and the successors and
predecessors functions. In Section 4, we define CQDD’s and give basic results.
In Section 5, we show how CQDD’s can be used to represent nonregular sets of
states and give our main results on the class of CQDD representable and reverse
representable sets of states. In Section 6, we present our generic forward and
backward analysis algorithm. Finally, we conclude in Section 7. Due to lack of
space we omit the proofs of the theorems. They can be found in [8].

2  Preliminaries 

Presburger arithmetics is the first order logic of natural numbers with addition,
subtraction and the usual ordering. We say that $f$ is a Presburger formula over a
set of variables $X = \{x_1, \ldots, x_n\}$, and we write $f(X)$, if the set of free variables in
$f$ is precisely $X$. The semantics of Presburger formulas is defined in the standard
way. Given a formula $f$ with free variables $X = \{x_1, \ldots, x_n\}$, and a valuation
$\nu : X \to \mathbb{N}$, we say that $\nu$ satisfies $f$, and write $\nu \models f$, if the evaluation of $f$
under $\nu$ is true. We say that a formula $f$ is valid if every valuation satisfies $f$.

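A simple automaton over $\Sigma$ (SA) is a finite-state automaton $A = (Q, q_0, \to, q_m)$
where $Q$ is a finite set of states with $Q = \{q_0, q_1, \ldots, q_m\} \cup \bigcup_{i=0}^{m} P_i$ where
$P_i = \{p_i^1, \ldots, p_i^{\ell_i}\}$, $q_0$ (resp. $q_m$) is the initial (resp. final) state, and
$\to \subseteq Q \times \Sigma \times Q$ is a set of transitions (transition relation) defined as the
smallest set such that:

1. $\forall i \in \{0, \ldots, m-1\}$. $\exists! a \in \Sigma$. $q_i \xrightarrow{a} q_{i+1}$,

2. $\forall i \in \{0, \ldots, m\}$, if $P_i \neq \emptyset$ then $\exists! a \in \Sigma$. $q_i \xrightarrow{a} p_i^1$, $\exists! a \in \Sigma$. $p_i^{\ell_i} \xrightarrow{a} q_i$
and $\forall j \in \{1, \ldots, \ell_i - 1\}$. $\exists! a \in \Sigma$. $p_i^j \xrightarrow{a} p_i^{j+1}$,

3. $\forall i \in \{0, \ldots, m-1\}$, if $P_i = \emptyset$ then there is at most one $a \in \Sigma$ with $q_i \xrightarrow{a} q_i$,

4. if $P_m = \emptyset$ then $\exists \Sigma' \subseteq \Sigma$. $\forall a \in \Sigma'$. $q_m \xrightarrow{a} q_m$.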

A restricted simple automaton (RSA) is a simple automaton where point (4)
in the definition above is replaced by: (4'). if $P_m = \emptyset$ then $q_m$ has no successors.

Notice that in simple automata, the outdegree of the states $q_i$, except maybe
$q_m$, is at most 2, whereas the outdegree of the states in the $P_i$ is always 1.
Each state different from $q_m$ belongs to at most one loop, which is of the form
$q_i \to p_i^1 \to \cdots \to p_i^{\ell_i} \to q_i$ (where $p_i^1, \ldots, p_i^{\ell_i}$ are the states of $P_i$). We say
that $q_i$ is the root of this loop. The state $q_m$ has
a particular status since it may be the root of several loops, but in this case all
these loops must be self-loops. In RSA $q_m$ has the same status as the other $q_i$’s.
Nondeterministic choices may occur only at the states $q_i$. A simple automaton
is deterministic if every state $q_i$ has at most one successor by each symbol in $\Sigma$.
We write DSA (resp. DRSA) for deterministic SA (resp. RSA).

Given a word $w = a_0 \ldots a_\ell \in \Sigma^*$, a run of $A$ over $w$ is a sequence of transitions
$\rho = (s_0, a_0, s_1) \ldots (s_\ell, a_\ell, s_{\ell+1})$ such that $s_0 = q_0$. The run $\rho$ is accepting
if $s_{\ell+1} = q_m$. The language accepted by $A$, denoted by $L(A)$, is the set of words
$w \in \Sigma^*$ such that there is an accepting run of $A$ over $w$.

Notice that RSA’s accept languages over $\Sigma$ which are definable by regular
expressions of the form $u_1 v_1^* u_2 v_2^* \cdots u_m v_m^* u_{m+1}$ where the $u_i$’s and the $v_i$’s are
words over $\Sigma$ such that only $u_1$ and $u_{m+1}$ may be empty. SA’s accept words of
the form $u_1 v_1^* u_2 v_2^* \cdots u_m v_m^* (a_1 + \ldots + a_k)^*$ where the $a_i$’s are symbols in $\Sigma$.
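
As a small illustration of the shape $u_1 v_1^* u_2 v_2^* \cdots u_m v_m^* u_{m+1}$, the sketch below builds an ordinary regular expression from the words $u_i$ and $v_i$ and tests membership. It is not a construction from the paper: the function name `rsa_regex` and the example words are hypothetical, and the check that only the first and last $u_i$ may be empty is our reading of the statement above.

```python
import re
from typing import List, Pattern


def rsa_regex(us: List[str], vs: List[str]) -> Pattern[str]:
    """Compile u_1 v_1* u_2 v_2* ... u_m v_m* u_{m+1} into a regex.

    us holds the m+1 words u_i, vs holds the m loop words v_i.
    """
    if len(us) != len(vs) + 1:
        raise ValueError("expected one more u-word than v-words")
    if any(v == "" for v in vs):
        raise ValueError("loop words v_i must be non-empty")
    if any(u == "" for u in us[1:-1]):
        raise ValueError("only u_1 and u_{m+1} may be empty")
    parts = []
    for u, v in zip(us, vs):
        parts.append(re.escape(u))
        parts.append("(?:" + re.escape(v) + ")*")
    parts.append(re.escape(us[-1]))
    return re.compile("".join(parts))


# Hypothetical example: the language a* b (ba)*.
example = rsa_regex(["", "b", ""], ["a", "ba"])
```

With this example, `example.fullmatch("aabba")` succeeds while `example.fullmatch("ba")` does not, since the single `b` must be followed by whole copies of `ba`.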

Let $A = (Q, q_0, \to, q_m)$ be a simple automaton. Let $X$ be the set of variables
$\{x_t : t \in \to\}$. We consider for each run $\rho$ of $A$ the valuation $\nu_\rho$ of the variables in
$X$ such that, for every $x_t \in X$, $\nu_\rho(x_t) = |\rho|_t$. Then, we define a Presburger formula
$[A]$ over $X$ which characterizes the set of valuations corresponding to all accepting
runs of $A$. For that, let us introduce some notations. We denote by $T$ the set of
transitions $\{t \in \to : \exists i \in \{0, \ldots, m-1\}. \exists a \in \Sigma. t = (q_i, a, q_{i+1})\}$. For each
state $q \in Q$, we denote by $In(q)$ (resp. $Out(q)$) the set of transitions of the form
$(q', a, q)$ (resp. $(q, a, q')$) for some $q' \in Q$ and $a \in \Sigma$. Now, let $[A]$ be the formula

$$\Big(\bigwedge_{t \in T} x_t = 1\Big) \wedge \Big(\bigwedge_{q \in Q \setminus \{q_0, q_m\}} \sum_{t \in In(q)} x_t = \sum_{t \in Out(q)} x_t\Big) \wedge \Big(1 + \sum_{t \in In(q_0)} x_t = \sum_{t \in Out(q_0)} x_t\Big).$$

It can be checked that for each valuation $\nu$ of the variables in $X$,
$\nu$ satisfies $[A]$ if and only if there exists an accepting run $\rho$ of $A$ such that $\nu = \nu_\rho$.

It is well known that every finite-state automaton has a characteristic Presburger
formula due to Parikh’s theorem [14]. However, the formula we give above
is simpler and exploits the particular structure of simple automata.


3  Communicating  Finite-State  Machines 

We  consider  a  generalization  of  communicating  finite-state  machines  (CFSM) 
defined  in  [4].  A  CFSM  is  a  finite-state  machine  which  can  send  and  receive 
messages  over  a  finite  set  of  unbounded  FIFO  queues.  Usually,  a  transition  either 
appends  a  message  to  the  end  of  a  queue  or  removes  a  message  from  the  head  of 
a  queue.  We  generalize  this  by  allowing  simultaneously  appending  and  removing 
messages  from  several  queues. 

Formally, a Communicating Finite-State Machine $M$ is a tuple $(S, K, \Sigma, T)$
where $S$ is a finite set of control states, $K$ is a finite set of unbounded FIFO queues,
$\Sigma$ is a finite set of messages, and $T$ is a finite set of transitions. Each transition is
of the form $(s_1, op, s_2)$, where $s_1$ and $s_2 \in S$, and $op$ is a finite set of queue
operations of the form $\kappa_i!w$ or $\kappa_i?w$ with $\kappa_i \in K$ and $w \in \Sigma^*$ such that for each
queue $\kappa_i$ there is at most one label $\kappa_i!w$ or $\kappa_i?w$ in $op$.

A configuration of $M$ is a tuple $\gamma = (s, w)$ where $s$ is a control state in $S$,
and $w = (w_1, \ldots, w_{|K|})$ is a $|K|$-dim multi-word (i.e., a tuple in $(\Sigma^*)^{|K|}$), each
$w_i$ being the contents of the queue $\kappa_i$, for $i \in \{1, \ldots, |K|\}$. We denote by $Conf$
the set of all configurations of $M$, i.e., $Conf = S \times (\Sigma^*)^{|K|}$.

We define a global transition relation between configurations in the following
manner: Let $\gamma = (s, w_1, \ldots, w_{|K|})$ and $\gamma' = (s', w'_1, \ldots, w'_{|K|})$ be two
configurations, and let $op$ be a set of queue operations. Then, we have $\gamma \xrightarrow{op} \gamma'$ if and only
if there exists a transition $(s, op, s') \in T$ such that, for every $i \in \{1, \ldots, |K|\}$,

- if $\kappa_i?w \in op$ then $w w'_i = w_i$ else if $\kappa_i!w \in op$ then $w'_i = w_i w$,

- otherwise $w'_i = w_i$.
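
The effect of a single transition on a configuration can be sketched as follows. The encoding is an assumption made for this example only: queues are Python strings indexed from 0, an operation is a triple `(queue_index, kind, word)` with `kind` equal to `"?"` (receive from the head) or `"!"` (send to the tail), and the names `step`, `Config`, `Op` and `Transition` are hypothetical.

```python
from typing import List, Optional, Tuple

Config = Tuple[str, Tuple[str, ...]]      # (control state, queue contents)
Op = Tuple[int, str, str]                 # (queue index, "?" or "!", word)
Transition = Tuple[str, List[Op], str]    # (source state, operations, target state)


def step(config: Config, transition: Transition) -> Optional[Config]:
    """Return the immediate successor of config by transition,
    or None if the transition is not executable at config."""
    state, queues = config
    source, ops, target = transition
    if state != source:
        return None
    new_queues = list(queues)
    # At most one operation per queue is assumed, so the order of ops is irrelevant.
    for index, kind, word in ops:
        if kind == "?":
            # Receive: word must be a prefix of the queue (w w'_i = w_i).
            if not new_queues[index].startswith(word):
                return None
            new_queues[index] = new_queues[index][len(word):]
        elif kind == "!":
            # Send: word is appended at the end of the queue (w'_i = w_i w).
            new_queues[index] = new_queues[index] + word
        else:
            raise ValueError("unknown queue operation: " + kind)
    return (target, tuple(new_queues))
```

For instance, `step(("s0", ("ab", "")), ("s0", [(0, "?", "a"), (1, "!", "c")], "s1"))` yields `("s1", ("b", "c"))`, while the same transition is not executable when the first queue starts with `b`.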

Given a transition $\tau = (s, op, s') \in T$, we say that $\tau$ is executable at $\gamma = (s, w)$
if there exists $\gamma' = (s', w')$ such that $\gamma \xrightarrow{op} \gamma'$. In this case, $\gamma'$ (resp. $\gamma$) is the
immediate successor (resp. predecessor) of $\gamma$ (resp. $\gamma'$) by $\tau$. We define the predecessor
and successor functions $pre_\tau$ and $post_\tau$, both in $2^{Conf} \to 2^{Conf}$, such that, for
every set of configurations $C$, $pre_\tau(C)$ (resp. $post_\tau(C)$) is the set of immediate
predecessors (resp. successors) of the configurations in $C$ by $\tau$. The $pre$ (resp.
$post$) function is defined as the union of the functions $pre_\tau$ (resp. $post_\tau$), for all
$\tau \in T$. The notion of executability can be generalized to sequences of transitions,
in particular to cycles in the transition graph. A sequence $\theta$ of transitions in
$T$ is called a cycle if it is of the form $(s_0, op_0, s_1)(s_1, op_1, s_2) \cdots (s_{n-1}, op_{n-1}, s_0)$.

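The definitions of $pre_\tau$ and $post_\tau$ can also be generalized to sequences of
transitions: for $\theta = \tau_1 \cdots \tau_n$, $pre_\theta = pre_{\tau_1} \circ \cdots \circ pre_{\tau_n}$ and
$post_\theta = post_{\tau_n} \circ \cdots \circ post_{\tau_1}$.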

Given a sequence of transitions $\theta$, the functions $pre^*_\theta$ and $post^*_\theta$ are the
reflexive transitive closures of $pre_\theta$ and $post_\theta$, i.e. given a set of configurations $C$,
$pre^*_\theta(C)$ (resp. $post^*_\theta(C)$) is the set of predecessors (resp. successors) of
configurations in $C$ obtained by iterating $\theta$ an arbitrary number of times.
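
To make the iteration of a cycle concrete, the following sketch enumerates the configurations obtained from a single starting configuration by executing a cycle 0, 1, 2, ... times, up to an explicit bound. It is an explicit-state illustration only, not the symbolic construction of the paper; the encoding of operations and the names `apply_transition`, `iterate_cycle` and `bound` are assumptions made for the example.

```python
from typing import List, Optional, Tuple

Config = Tuple[str, Tuple[str, ...]]
Op = Tuple[int, str, str]                 # (queue index, "?" or "!", word)
Transition = Tuple[str, List[Op], str]


def apply_transition(config: Config, transition: Transition) -> Optional[Config]:
    """Successor of config by transition, or None if it is not executable."""
    state, queues = config
    source, ops, target = transition
    if state != source:
        return None
    new_queues = list(queues)
    for index, kind, word in ops:
        if kind == "?":
            if not new_queues[index].startswith(word):
                return None
            new_queues[index] = new_queues[index][len(word):]
        else:  # "!" appends the word at the end of the queue
            new_queues[index] = new_queues[index] + word
    return (target, tuple(new_queues))


def iterate_cycle(config: Config, cycle: List[Transition], bound: int) -> List[Config]:
    """Configurations reached after 0, 1, ..., k executions of the cycle,
    where k is at most bound and stops early if the cycle gets blocked."""
    reached = [config]
    current = config
    for _ in range(bound):
        for transition in cycle:
            nxt = apply_transition(current, transition)
            if nxt is None:
                return reached
            current = nxt
        reached.append(current)
    return reached


# Hypothetical cycle sending one 'a' to each of two queues per execution.
send_both = [("s0", [(0, "!", "a"), (1, "!", "a")], "s0")]
```

Starting from `("s0", ("", ""))`, the cycle `send_both` produces queue contents of the form $(a^n, a^n)$: the two queues always hold the same number of symbols, a counting constraint that a product of regular languages cannot express.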

We  define  the  functions  pre*  and  post*  as  the  reflexive  transitive  closures  of 
pre  and  post.  The  function  pre*  (resp.  post*)  yields  the  set  of  all  predecessors 
(resp.  successors)  of  a  given  set  of  configurations. 

4  Constrained  Queue  Description  Diagrams 

In  this  section  we  introduce  representation  structures  for  sets  of  queue  contents. 
These  structures  consist  of  a  combination  of  finite-state  automata  (restricted 
deterministic  simple  automata)  with  linear  constraints  on  the  number  of  times 
transitions  in  these  automata  are  taken.  This  combination  allows  to  represent 
nonregular  sets  of  queue  contents. 

4.1  Definition 

Constrained Queue Description Diagrams (CQDD’s) are a particular case of
constrained simple automata. For any $n \geq 1$, an n-dim constrained simple automaton
(CSA) is a set of accepting components $C = \{(A_1, f_1), \ldots, (A_m, f_m)\}$ where,
for every $i \in \{1, \ldots, m\}$, $A_i$ is a tuple of $n$ simple automata $(A_i^1, \ldots, A_i^n)$ over
$\Sigma$ and $f_i$ is a Presburger formula over a set of variables $V_i$ containing the set
$X_i = \{x_t : t \in T_i\}$, where $T_i$ is the set of all the transitions of the automata in
$A_i$, i.e., $T_i = \bigcup_{j=1}^{n} \to_{A_i^j}$.

The CSA $C$ accepts an n-dim multi-language, i.e., a set of tuples of $n$ words.
For every $i \in \{1, \ldots, m\}$, the multi-language of the accepting component $(A_i, f_i)$,
denoted by $L((A_i, f_i))$, is the set of tuples of words $(w_1, \ldots, w_n) \in (\Sigma^*)^n$ for which
there are accepting runs $(\rho_1, \ldots, \rho_n)$ of the automata $(A_i^1, \ldots, A_i^n)$ respectively,
such that $\exists (V_i \setminus X_i).\ f_i$ is satisfied by the valuation $(\nu_{\rho_1}, \ldots, \nu_{\rho_n})$ (i.e., the
valuation associating with each variable $x_t$ the integer $|\rho_1 \ldots \rho_n|_t$, where $t \in T_i$). The
multi-language of the CSA $C$, denoted by $L(C)$, is the union $\bigcup_{i=1}^{m} L((A_i, f_i))$.

An n-dim CDSA is an n-dim CSA such that all its automata are deterministic.
An n-dim CQDD is an n-dim CDSA such that all its automata are restricted
(DRSA’s). For every $n \geq 1$, we denote by n-CQDD (resp. n-CDSA) the class of
all n-dim CQDD’s (resp. n-dim CDSA’s). We say that an n-dim multi-language $\mathcal{L}$
is CQDD (resp. CDSA) definable if there exists an n-dim CQDD (resp. CDSA) $C$
such that $L(C) = \mathcal{L}$. An n-dim multi-language $\mathcal{L}$ is CQDD (resp. CDSA) reverse
definable if its reverse image is CQDD (resp. CDSA) definable.


4.2  Expressiveness 

CQDD’s allow to define nonregular multi-languages. For instance, consider the
context-sensitive language $L_1 = \{a^n b^m a^n b^m : n, m \geq 1\}$. To define $L_1$, we use
the automaton $A_1$ represented by the following picture:

[Figure: automaton $A_1$ with states $q_0$, $q_1$, $q_2$, $q_3$, $q_4$]

Then, $L_1$ is defined by the 1-dim CQDD $\{(A_1, f_1)\}$ where $f_1$ is given by
$x_{(q_1,a,q_1)} = x_{(q_3,a,q_3)} \wedge x_{(q_2,b,q_2)} = x_{(q_4,b,q_4)}$. Consider the 2-dim multi-language

L2  =  {{a^b^a^b'^ ,c^d^a'^)  :  n,m>  1}.  To  define  this  multi-language,  we  use 
two  automata,  the  automaton  Ai  above  and  A2  given  by  the  following  picture: 


[Figure: automaton $A_2$ with transitions labelled c, d, a]


Then,  L2  is  defined  by  the  2-dim  CQDD  {((Ai,  A2),  /2)}  where  /2  is  given  by 

(^(qi , a, <7l)  “^(93,0.93)  =^(<?2>^-92)  ~^)  ^  (^(9'2,i>,q2)  =^(94.^>,q4)  ^  (^g  )  ” '^) ' 

These examples show that CQDD’s can be used to express nonregular
multi-languages involving constraints on the number of occurrences of symbols at some
positions that may be in the same word (as in $L_1$), or even in different words (as in
$L_2$). This allows to represent sets of queue contents such that there are counting
constraints relating the contents of different queues.
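
The idea of pairing a regular "shape" with a counting constraint can be illustrated on the language $\{a^n b^m a^n b^m : n, m \geq 1\}$. The sketch below is not the CQDD machinery itself: it uses an ordinary regular expression for the shape $a^+ b^+ a^+ b^+$ and a direct comparison of block lengths in place of a Presburger formula. The function name `in_l1` is hypothetical.

```python
import re

# Regular shape: four non-empty blocks a+ b+ a+ b+.
_SHAPE = re.compile(r"(a+)(b+)(a+)(b+)")


def in_l1(word: str) -> bool:
    """Check membership in { a^n b^m a^n b^m : n, m >= 1 }."""
    match = _SHAPE.fullmatch(word)
    if match is None:
        return False
    a1, b1, a2, b2 = (len(block) for block in match.groups())
    # Counting constraint playing the role of the linear formula:
    # both a-blocks have equal length, and so do both b-blocks.
    return a1 == a2 and b1 == b2
```

For example, `in_l1("aabaab")` holds, whereas `in_l1("aabab")` fails the counting constraint although the word has the right regular shape.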


4.3  Basic  operations  and  decision  problems 

Here we give the main results about boolean operations on CQDD’s. We show
that they are closed under union, intersection, concatenation and left-derivation,
but their complementation yields CDSA’s. Concatenation and left-derivation are
useful operations in the construction of sets of successors and predecessors (see
Section 5). Moreover, the intersection of a CQDD with a CDSA is a CQDD.
Finally, we show that the membership and inclusion problems are decidable for CQDD’s.

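Let $\mathcal{L}_1$ and $\mathcal{L}_2$ be two n-dim multi-languages. The concatenation of $\mathcal{L}_1$ and
$\mathcal{L}_2$, denoted by $\mathcal{L}_1 \cdot \mathcal{L}_2$, is the set $\{w \in (\Sigma^*)^n : \exists u \in \mathcal{L}_1.\ \exists v \in \mathcal{L}_2.\ w = uv\}$
where $uv$ is the component-wise concatenation of $u$ and $v$. The left-derivative of
$\mathcal{L}_1$ by $\mathcal{L}_2$, denoted by $\mathcal{L}_2^{-1} \cdot \mathcal{L}_1$, is the set $\{w \in (\Sigma^*)^n : \exists w' \in \mathcal{L}_2.\ w'w \in \mathcal{L}_1\}$,
i.e., the set of multi-words allowing to extend elements of $\mathcal{L}_2$ to elements of $\mathcal{L}_1$.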
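
On finite multi-languages, both operations can be computed directly, which may help to fix the definitions. The sketch below represents a multi-word as a tuple of strings and a multi-language as a finite Python set of such tuples; this explicit representation and the names `concat` and `left_derivative` are assumptions made for the example (the paper works with symbolic, possibly infinite, sets).

```python
from typing import Set, Tuple

MultiWord = Tuple[str, ...]


def concat(l1: Set[MultiWord], l2: Set[MultiWord]) -> Set[MultiWord]:
    """Component-wise concatenation: { uv : u in l1, v in l2 }."""
    return {
        tuple(u_i + v_i for u_i, v_i in zip(u, v))
        for u in l1
        for v in l2
    }


def left_derivative(l1: Set[MultiWord], l2: Set[MultiWord]) -> Set[MultiWord]:
    """Left-derivative of l1 by l2: { w : there is w' in l2 with w'w in l1 }."""
    result = set()
    for x in l1:
        for prefix in l2:
            # w'w = x component-wise means each w'_i is a prefix of x_i.
            if all(x_i.startswith(p_i) for x_i, p_i in zip(x, prefix)):
                result.add(tuple(x_i[len(p_i):] for x_i, p_i in zip(x, prefix)))
    return result
```

Intuitively, sending messages corresponds to concatenation on the right, and receiving messages corresponds to taking a left-derivative, which is why these two operations appear in the construction of successors and predecessors.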

Proposition 4.1 For every $n \geq 1$, n-CQDD is closed under union, intersection,
concatenation and left-derivation.

It can be observed that the product of a DRSA with a DSA is a DSA. Hence:

Proposition 4.2 For every $n \geq 1$, the intersection of an n-CQDD with an
n-CDSA is an n-CQDD.

Because the simple automata in a CQDD are deterministic we can show:

Proposition 4.3 For every $n \geq 1$, the complement of an n-CQDD is an n-CDSA.

Let $C = \{(A, f)\}$ be a CSA. Clearly, $L(C) \neq \emptyset$ if and only if the Presburger
formula $[A] \wedge f$ is satisfiable. Hence:

Proposition 4.4 The emptiness problem is decidable for CSA’s.

From Propositions 4.2, 4.3, and 4.4 we deduce:

Corollary 4.1 For every $n \geq 1$, the membership problem as well as the inclusion
problem are decidable for n-dim CQDD’s.

5  Representing  and  manipulating  sets  of  configurations 

Let $M = (S, K, \Sigma, T)$ be a CFSM. Every set of configurations $C \subseteq Conf$ can
be written as a union $\bigcup_{s \in S} \{s\} \times \mathcal{L}_s$ where the $\mathcal{L}_s$’s are $|K|$-dim multi-languages.

We say that $C$ is CQDD representable (resp. reverse representable) if for every
$s \in S$, the multi-language $\mathcal{L}_s$ is CQDD definable (resp. reverse definable). Let us
consider as an example the system $M$:

Av2!6} 

{K2\a,Ks\a} 

The  set  of  configurations  of  M  reachable  from  the  (so,e,e)  is  given  by: 

{so}x{{a^,{ba)^,a^)  :  n,m  >  0}U{si}  x  {{a^,iba)^b,a^)  :  n,m>0}  (1) 
and  is  clearly  CQDD  representable. 

In the sequel, we present results allowing to manipulate and to reason about
sets of configurations that are CQDD representable or reverse representable. First
of all, by Propositions 4.1, 4.4 and Corollary 4.1, we deduce:

Theorem 5.1 The class of CQDD representable (resp. reverse representable)
sets of configurations is effectively closed under union and intersection, and has
decidable emptiness, membership and inclusion problems.

The  closure  property  under  concatenation  and  left-derivation  of  CQDD’s 
(Proposition  4.1)  allows  us  to  show: 

Theorem  5.2  For  every  CQDD  representable  (resp.  reverse  representable)  set 
of  configurations  C ,  the  set  of  configurations  post{C)  (resp.  pre{C))  is  CQDD 
representable  (resp.  reverse  representable)  and  effectively  constructible. 

Now,  we  give  our  main  results. 

Theorem 5.3 For every CQDD representable set of configurations $C$, and every
cycle $\theta$, the set of configurations $post^*_\theta(C)$ is CQDD representable and effectively
constructible.

We give hereafter a rough scheme of the proof: Let $C$ be a set of configurations
given by an m-dim CQDD of the form $\{((A_1, \ldots, A_m), f)\}$ (this is not a restriction
since $post^*_\theta$ is distributive w.r.t. union). The principle of the construction is to
compute the effect of $n$ successive executions of the cycle $\theta$ on each queue, $n$
being a parameter.

Then, for every $i \in \{1, \ldots, m\}$, we construct several automata $A'_i$ and
constraints $g_i$. These constraints depend on $n$ (considered as a new free variable),
and relate variables corresponding to the transitions of $A'_i$ with those corresponding
to the transitions of $A_i$. The set $post^*_\theta(C)$ is then represented by a union of
CQDD’s $\{((A'_1, \ldots, A'_m), f \wedge \bigwedge_{i=1}^{m} g_i)\}$. Note that, since all the $g_i$’s depend on
the variable $n$, this expresses the fact that the number of executions of $\theta$ must
be the same for every queue.

The construction of the automaton $A'_i$ and the constraint $g_i$ is done by
identifying the configurations from which the cycle $\theta$ can be executed an unbounded
number of times and those allowing only a bounded number of executions. Then,
we show that in each case, $A'_i$ and $g_i$ are obtained from $A_i$ using basic operations
on CQDD’s such as concatenation and left-derivation.

We can also prove that the class of CQDD reverse representable sets of
configurations is effectively closed under the $pre^*_\theta$ function, for every cycle $\theta$.

Theorem 5.4 For every CQDD reverse representable set of configurations $C$,
and every cycle $\theta$, the set of configurations $pre^*_\theta(C)$ is CQDD reverse
representable and effectively constructible.


6  Forward  and  backward  reachability  analysis 

The basic (safety) verification problem consists in checking that a bad
configuration can never be reached from an initial configuration. Thus, given a set of
initial configurations $I$ and a set of bad configurations $B$, this problem can be
formulated either as

- (P1) $B \cap post^*(I) = \emptyset$, or

- (P2) $I \cap pre^*(B) = \emptyset$.

The first formulation consists of a forward reachability analysis of the
configuration space whereas the second one consists of a backward reachability analysis.
Hence, given a set of configurations $C$, we wish to compute the set of its successors
and predecessors, i.e., $post^*(C)$ and $pre^*(C)$. By definition, for $\phi \in \{post, pre\}$,
we have

$\phi^*(C) = \bigcup_{i \geq 0} C_i$

where

$C_0 = C$

$C_{i+1} = C_i \cup \phi(C_i)$ for every $i \geq 0$.

In the case $\phi = post$ (resp. $\phi = pre$), if $C$ is CQDD representable (resp. reverse representable), it can be deduced from Theorems 5.1 and 5.2 that all the $C_i$'s are CQDD representable (resp. reverse representable). Hence, the equations above yield a semi-algorithm for calculating $\phi^*(C)$ based on an iterative calculation of the $C_i$'s. Since the sequence of the $C_i$'s is increasing, the limit is reached if for some $i$ we have $C_i = C_{i+1}$. Then the algorithm stops and returns $C_i$. We can detect this since the inclusion problem is decidable for CQDD representable (resp. reverse representable) sets of configurations (by Theorem 5.1). Then, if the sets of initial and bad states are also CQDD representable (resp. reverse representable), the problem P1 (resp. P2) above can be solved by Theorem 5.1.

Of course, since the reachability problem is undecidable for CFSM's, an index $i$ such that $C_i = C_{i+1}$ does not exist in general, and the (naive) algorithm described above may never stop.

We propose to tackle this divergence problem by performing an "exact acceleration" of the iterative calculation of the limit $\phi^*(C)$. The idea is as follows: given a set of cycles in the transition graph of the system, say $\Theta$, add at each step the set of successors (or predecessors) by each of the cycles in $\Theta$. This operation is sound (exact) since all the added configurations belong to $\phi^*(C)$. So, we compute $\phi^*(C)$ as the limit of another increasing sequence of sets of configurations $(D_i)_{i \geq 0}$ given by:

$$D_0 = C$$

$$D_{i+1} = D_i \cup \phi(D_i) \cup \bigcup_{\theta \in \Theta} \phi_\theta^*(D_i) \text{ for every } i \geq 0$$

Clearly, for every $i \geq 0$, we have $C_i \subseteq D_i$. Hence, the chance to reach the limit $\phi^*(C)$ in a finite number of steps is greater (or at least equal) when considering the $D_i$'s instead of the $C_i$'s, and this chance should increase with the size of $\Theta$.

Therefore, using Theorems 5.1, 5.2, 5.3 and 5.4, we obtain a generic reachability analysis semi-algorithm which computes (when it terminates) the exact set of successors (resp. predecessors) of a given CQDD representable (resp. reverse representable) set of configurations. This algorithm is given by:

Reachability($\Theta$, $C$):

    $X := C$;

    repeat

        $Y := X$;

        $X := X \cup \phi(X) \cup \bigcup_{\theta \in \Theta} \phi_\theta^*(X)$

    until $X = Y$;

    return($X$)

end Reachability

A variety of reachability algorithms can be derived from the generic algorithm above by determining adequate strategies for choosing the set of cycles $\Theta$.
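The effect of adding cycle closures to the iteration can be seen on a toy system. The following Python sketch is not the CQDD construction: as a simplifying assumption, a set of configurations is a union of integer intervals (the possible values of a single counter), `post` is one execution of an incrementing self-loop, and `post_star_cycle` is a hypothetical exact acceleration of that loop. The bound `max_steps` is also an assumption, added only so that the divergence of the naive iteration can be observed.

```python
import math

def normalize(intervals):
    """Merge overlapping or adjacent integer intervals into a canonical tuple."""
    merged = []
    for lo, hi in sorted(intervals):
        if merged and lo <= merged[-1][1] + 1:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return tuple(merged)

def post(x):
    """One step of the loop: the counter is incremented by one."""
    return normalize((lo + 1, hi + 1) for lo, hi in x)

def post_star_cycle(x):
    """Exact effect of repeating the loop any number of times."""
    return normalize((lo, math.inf) for lo, hi in x)

def reachability(c, phi, accelerators=(), max_steps=50):
    """Iterate X := X u phi(X) u (accelerated cycles) until X is stable.

    Returns the fixpoint, or None if it is not reached within max_steps.
    """
    x = normalize(c)
    for _ in range(max_steps):
        y = x
        parts = list(x) + list(phi(x))
        for accelerate in accelerators:
            parts += list(accelerate(x))
        x = normalize(parts)
        if x == y:
            return x
    return None
```

Without accelerators, `reachability([(0, 0)], post)` keeps growing the interval and gives up; with `accelerators=[post_star_cycle]` the unbounded interval is added in the first round and the second round detects stability.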

For  instance,  the  forward  reachability  analysis  algorithm  given  in  [5]  can  be 
seen  as  a  possible  instance  of  our  algorithm^.  Indeed,  in  [5]  the  authors  con¬ 
sider  the  set  of  cycles  that  are  of  one  of  the  following  three  forms:  (s,  s), 

(s,  {«:?«;},  s),  or  (s,  {ki?w;},  5).  These  kind  of  cycles  do  not  intro¬ 

duce  counting  constraints  on  queue  contents.  Hence,  starting  from  a  regular  set 
of configurations (finite-state automata definable), the set of reachable configurations by these cycles is also regular. Then, a representation structure based only on finite-state automata (QDD's) can be used, and it makes the analysis of some significant systems possible. But considering QDD's and only cycles of the form specified above does not allow one to reason about systems with nonregular sets of configurations like the system $\mathcal{M}$ given in Section 5. However, it is easy to see that our algorithm terminates and computes the exact set of configurations of the system $\mathcal{M}$ (given
by  1)  if  we  consider  as  0  the  set  of  the  two  elementary  cycles  (so,  So), 

and  (so,  ^o)}- 


7  Conclusion 

We have applied the symbolic analysis principle to fifo-channel systems (communicating finite state machines). These systems have in general nonregular sets of configurations. We have proposed a representation structure for their sets of configurations combining finite-state automata with counting constraints expressed in Presburger arithmetic. We have shown that this structure allows us to compute the exact effect of the repeated execution of any fixed cycle in the transition graph of a system. We have defined a generic reachability analysis semi-algorithm which is parametrized by a set of cycles. This semi-algorithm computes iteratively the set of successors (or predecessors) by considering these cycles as additional "meta-transitions" in the graph, following the approach adopted in [6, 5].

It can be seen that our reachability analysis procedure computes a fixpoint of a function on sets of configurations which is of a very particular form. Actually, this procedure can be generalized to a model-checking procedure for any positive fixpoint formula constructed using disjunctions, conjunctions, and the successor (predecessor) function, starting from basic CQDD (reverse) representable sets.

^ In the definition of QDD's, any deterministic finite-state automaton can be used. However, it can be checked that, starting from an initial configuration with empty queues, the constructed QDD is a union of DRSA's.

Abstract. Milner proposed an axiomatization for the Kleene star in basic process algebra, in the presence of deadlock and empty process, modulo bisimulation equivalence. In this paper, Milner's axioms are adapted to no-exit iteration $x^\omega$, which executes $x$ infinitely many times in a row, and it is shown that this axiomatization is complete for no-exit iteration in basic process algebra with deadlock and empty process, modulo bisimulation.


1  Introduction 

Kleene [15] defined a binary operator $x^*y$ in the context of finite automata, which denotes the iterate of $x$ on $y$. Intuitively, the expression $x^*y$ can choose to execute either $x$, after which it evolves into $x^*y$ again, or $y$, after which it terminates. A feature of the Kleene star is that on the one hand it can express recursion, while on the other hand it can be captured in equational laws. Hence, one does not need meta-principles such as the Recursive Specification Principle [10]. Kleene formulated several equations for his operator, notably the defining equation $x^*y = x(x^*y) + y$. In later years it became more fashionable to consider the unary version $x^*$ of the Kleene star. In the presence of the empty process, the unary and the binary Kleene star are equally expressive.

Salomaa  [22]  presented  a  finite  complete  axiomatization  for  the  Kleene  star  in  language 
theory,  modulo  completed  trace  equivalence,  which  incorporates  one  conditional  axiom, 
namely,  if  x  =  y  x  +  z,  and  y  cannot  terminate  immediately,  then  x  =  y*z.  Salomaa’s 
completeness  proof  basically  consists  of  two  steps:  first  he  shows  that  the  solutions  of  a 
guarded  recursive  specification  are  all  provably  equal  to  the  same  term,  and  next  he  shows 
that  if  two  terms  are  completed  trace  equivalent,  then  there  exists  a  guarded  recursive 
specification  for  which  both  terms  are  solutions. 

Milner  [17]  was  the  first  to  study  the  (unary)  Kleene  star  modulo  bisimulation,  and 
proposed  an  axiomatization  for  it,  being  an  adaptation  of  Salomaa’s  axiom  system.  Milner 
[17,  page  461]  raised  the  question  whether  his  axiomatization  is  complete  for  the  Kleene 
star in process theory, and remarked that this question may be hard to answer: "The difficulty is that the method [...] of Salomaa's original completeness proof cannot be applied directly, since - in contrast with the case of languages - an arbitrary system of guarded equations [...] cannot in general be solved in star expressions".

In this paper the instantiation $x^*\delta$ of the binary Kleene star is studied, which carries two names: perpetual loop and no-exit iteration. Since the deadlock $\delta$ blocks the exits, this construct executes $x$ an infinite number of times in a row. The perpetual loop is closely related to the Kleene star, and shares several of its characteristics. In this paper no-exit iteration, which is denoted by $x^\omega$, is studied in Basic Process Algebra [9] with deadlock and empty process, denoted by $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$. No-exit iteration can be used to formally describe programs that repeat a certain procedure without end. A significant advantage of iteration over recursion as a means to express infinite processes is that it does not involve a parametric process definition, because the development of process theory is easier if parametrization does not have to be taken as primitive (see e.g. Milner [18, page 212]). Since the syntax of process algebra with iteration has an inductive term structure, it allows simpler axiomatizations than recursion, and it does not need a guardedness restriction to locate the class of meaningful terms. Therefore, the Kleene star is used for example in the specification and verification of Grid protocols [7], which describe parallel computations in a grid-like architecture, and in the ToolBus [8], which enables one to link separate tools. In both cases, iteration is used almost exclusively in the form of the perpetual loop. No-exit iteration is also used in the educational vein [21], because it enables one to specify and verify infinite processes in a simple and intuitive way.

The three axioms for the unary Kleene star in Milner's axiom system (being Kleene's defining equation, Salomaa's conditional axiom and an equation which describes the interplay of Kleene star and empty process) have obvious counterparts for no-exit iteration. It turns out that these three axioms, together with the standard axioms for $\mathrm{BPA}_{\delta\epsilon}(A)$, make a complete axiomatization for $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$ modulo bisimulation. The completeness proof is based on a strategy that originates from [11]. It also uses new techniques, which will hopefully turn out to be applicable in a possible proof of Milner's conjecture (see Section 4 for a discussion on this topic). For a detailed presentation of the completeness proof for $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$, and for omitted proofs in this paper, the reader is referred to [12].

This paper focuses on the process algebra $\mathrm{BPA}^\omega_\delta(A)$, in which the empty process is not present. This setting allows a more concise presentation of the ideas that are used in the completeness proof for the perpetual loop in process algebra. We will see that Kleene's defining equation and Salomaa's conditional axiom for the perpetual loop, together with the standard axioms for $\mathrm{BPA}_\delta(A)$, are complete for $\mathrm{BPA}^\omega_\delta(A)$ modulo bisimulation.

Sewell [23] proved that there does not exist a complete finite equational axiomatization for the Kleene star in combination with deadlock modulo bisimulation, due to the fact that $a^\omega$ is bisimilar to $(a^n)^\omega$ for $n = 1, 2, \ldots$. Since these equivalences are also present in $\mathrm{BPA}^\omega(A)$, Sewell's argument can be copied to conclude that there does not exist a complete finite equational axiomatization for $\mathrm{BPA}^\omega(A)$. Hence, the adaptation of Salomaa's conditional axiom for the perpetual loop is essential for the obtained completeness results.

The requirement "$y$ cannot terminate immediately" in Salomaa's conditional axiom can be defined inductively on the syntax. According to Kozen [16] this requirement is not algebraic, in the sense that it is not preserved under substitution of terms for actions. He proposed two alternative conditional axioms which do not have this drawback. These axioms, however, are not sound with respect to bisimulation equivalence.

Bergstra, Bethke and Ponse [6] suggested a finite equational axiomatization for $\mathrm{BPA}^*$, i.e., for basic process algebra with the binary Kleene star without the special constants $\delta$ and $\epsilon$, modulo bisimulation. Their conjecture that it is complete was proved by Fokkink and Zantema [14]. (In contrast with this result, Aceto, Fokkink and Ingolfsdottir [3] showed that there does not exist a complete finite equational axiomatization for $\mathrm{BPA}^*$ modulo any process semantics in between ready simulation and completed traces.) In [11], a new proof for the completeness result from [14] was presented. This new proof technique was applied successfully not only in this paper, but also in a paper on a restricted version of iteration called prefix iteration, which is better suited for a setting with prefix multiplication or with communication [2], and in a paper on a more expressive variant of iteration called multi-exit iteration [1].

2 The Perpetual Loop in Process Algebra

2.1  Syntax 

We assume a non-empty alphabet $A$ of atomic actions, with typical elements $a, b, c$. We also assume two special constants $\delta$, which represents deadlock, and $\epsilon$, which represents the empty process, and ^ ranges over $A \cup \{\delta, \epsilon\}$. Furthermore, we have two binary operators: alternative composition $x + y$, which combines the behaviours of $x$ and $y$, and sequential composition $x \cdot y$, which puts the behaviours of $x$ and $y$ in sequence. Finally, we have the unary operator $x^\omega$, which executes $x$ infinitely many times in a row. We will refer to this operator both as perpetual loop and as no-exit iteration. The language $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$, with typical elements $p, q$, consists of all the terms that can be constructed from the atomic actions, the two special constants, the two binary composition operators, and the perpetual loop. That is, the BNF grammar for the collection of process terms is:

$$p ::= a \mid \delta \mid \epsilon \mid p + p \mid p \cdot p \mid p^\omega$$

$\mathrm{BPA}^\omega_\delta(A)$ is obtained by deleting the empty process $\epsilon$, and $\mathrm{BPA}^\omega(A)$ is obtained by deleting the deadlock $\delta$ and the empty process $\epsilon$ from the syntax. The sequential composition operator will often be omitted, so $pq$ denotes $p \cdot q$. As binding convention, alternative composition binds weaker than sequential composition and no-exit iteration.

Remark: The presence of the special constant $\delta$ in $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$ is redundant, because it can be expressed in $\mathrm{BPA}^\omega_\epsilon(A)$ modulo bisimulation: $\epsilon^\omega$ is bisimilar with $\delta$, because both processes do not exhibit any behaviour. However, $\delta$ is maintained in the syntax as a standard abbreviation.


2.2  Operational  Semantics 

Table 1 presents an operational semantics for $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$ in Plotkin style [20], where $x \xrightarrow{a} x'$ represents that process $x$ can evolve into process $x'$ by the execution of action $a$, and $x \xrightarrow{a} \surd$ denotes that process $x$ can terminate by the execution of action $a$, and the unary predicate $x\,\surd$ denotes that process $x$ can terminate immediately.


Table 1. Transition rules for $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$

Definition 1. $p'$ is a derivative of $p$ if $p$ can evolve into $p'$ by zero or more transitions. $p'$ is a proper derivative of $p$ if $p$ can evolve into $p'$ by one or more transitions.

Note  that  a  process  term  can  be  a  proper  derivative  of  itself,  for  example,  a*b  a*b. 
In  the  sequel,  p'  and  p”  will  denote  derivatives  of  process  term  p.  The  following  lemma 
can  easily  be  deduced,  using  structural  induction. 

Lemma 2. Each process term in $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$ has only finitely many derivatives.
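Lemma 2 can be observed on small terms. The following Python sketch assumes the usual structural rules for the fragment without the empty process: an action terminates after one step, a sum offers the steps of both summands, a sequential composition continues with its second argument once the first has terminated, and $x^\omega$ restarts itself after each round of $x$. The tuple encoding of terms and the names `steps`, `derivatives` and `TICK` are choices made for this illustration only.

```python
TICK = "tick"  # marker for successful termination

def steps(t):
    """Outgoing transitions of a term as a set of (action, target) pairs.

    Terms are nested tuples: ("act", name), ("delta",), ("alt", p, q),
    ("seq", p, q) and ("omega", p). A target is a term or TICK.
    """
    kind = t[0]
    if kind == "act":
        return {(t[1], TICK)}
    if kind == "delta":
        return set()
    if kind == "alt":
        return steps(t[1]) | steps(t[2])
    if kind == "seq":
        return {(a, t[2] if x == TICK else ("seq", x, t[2]))
                for a, x in steps(t[1])}
    if kind == "omega":
        return {(a, t if x == TICK else ("seq", x, t))
                for a, x in steps(t[1])}
    raise ValueError(f"unknown term constructor: {kind!r}")

def derivatives(t):
    """All terms reachable from t by zero or more transitions."""
    seen, todo = {t}, [t]
    while todo:
        for _, x in steps(todo.pop()):
            if x != TICK and x not in seen:
                seen.add(x)
                todo.append(x)
    return seen
```

For instance, with `a = ("act", "a")`, the term `("omega", a)` is its only derivative, and `("omega", ("seq", a, a))` has exactly two derivatives: itself and `a` followed by itself.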

Process  terms  are  considered  modulo  bisimulation  equivalence  from  Park  [19].  Intuitively, 
two  processes  are  bisimilar  if  they  have  the  same  branching  structure. 

Definition 3. Two processes $p$ and $q$ are bisimilar, denoted by $p \underline{\leftrightarrow} q$, if there exists a symmetric binary relation $B$ on processes which relates $p$ and $q$, such that:

- if $r\,B\,s$ and $r \xrightarrow{a} r'$, then there is a transition $s \xrightarrow{a} s'$ such that $r'\,B\,s'$;

- if $r\,B\,s$ and $r \xrightarrow{a} \surd$, then $s \xrightarrow{a} \surd$.

Bisimulation equivalence is a congruence with respect to all the operators, which means that if $p \underline{\leftrightarrow} p'$ and $q \underline{\leftrightarrow} q'$, then $p + q \underline{\leftrightarrow} p' + q'$ and $pq \underline{\leftrightarrow} p'q'$ and $p^\omega \underline{\leftrightarrow} (p')^\omega$. Namely, the transition rules in Table 1 are in the 'path' format, which guarantees that the generated bisimulation equivalence is a congruence, see [5, 13].
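Since every term has finitely many derivatives, bisimilarity of two concrete terms can be checked mechanically. The Python sketch below is one possible way to do so for the fragment without the empty process: it starts from the largest candidate relation and repeatedly discards pairs that violate one of the two clauses of Definition 3. The term encoding, the assumed transition rules in `steps`, and all function names are illustrative assumptions, not part of the paper.

```python
TICK = "tick"  # pseudo-state reached by successful termination

def steps(t):
    """Outgoing transitions of a term as a set of (action, target) pairs."""
    kind = t[0]
    if kind == "act":
        return {(t[1], TICK)}
    if kind == "delta":
        return set()
    if kind == "alt":
        return steps(t[1]) | steps(t[2])
    # "seq" continues with its right argument, "omega" with the loop itself
    rest = t[2] if kind == "seq" else t
    return {(a, rest if x == TICK else ("seq", x, rest))
            for a, x in steps(t[1])}

def reachable(t):
    """Derivatives of t, together with the termination pseudo-state."""
    seen, todo = {t, TICK}, [t]
    while todo:
        for _, x in steps(todo.pop()):
            if x not in seen:
                seen.add(x)
                todo.append(x)
    return seen

def bisimilar(p, q):
    states = reachable(p) | reachable(q)
    succ = {s: (set() if s == TICK else steps(s)) for s in states}
    # TICK may only be related to TICK, so that termination is matched
    # by termination and not by deadlock.
    rel = {(r, s) for r in states for s in states
           if (r == TICK) == (s == TICK)}

    def simulated(r, s):
        return all(any(a == b and (x, y) in rel for b, y in succ[s])
                   for a, x in succ[r])

    while True:
        bad = {(r, s) for r, s in rel
               if not (simulated(r, s) and simulated(s, r))}
        if not bad:
            return (p, q) in rel
        rel -= bad
```

With `a = ("act", "a")` and `aa = ("seq", a, a)`, the check confirms that `("omega", a)` and `("omega", aa)` are bisimilar while `a` and `aa` are not.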


2.3  Axiomatizations 

Table 2 presents the standard axioms A1-9 for $\mathrm{BPA}_{\delta\epsilon}(A)$. Furthermore, Table 3 contains the defining equation NEI1 together with the conditional axiom $\mathrm{RSP}^\omega$ for the perpetual loop. The axiomatization A1-7+NEI1+$\mathrm{RSP}^\omega$ is sound for $\mathrm{BPA}^\omega_\delta(A)$, i.e., if $p = q$ in $\mathrm{BPA}^\omega_\delta(A)$ is provable from these axioms, then $p \underline{\leftrightarrow} q$. Since bisimulation equivalence is a congruence for $\mathrm{BPA}^\omega_\delta(A)$, soundness can be verified by checking this property for each axiom separately, which is left to the reader.


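A1   $x + y = y + x$
A2   $(x + y) + z = x + (y + z)$
A3   $x + x = x$
A4   $(x + y) \cdot z = x \cdot z + y \cdot z$
A5   $(x \cdot y) \cdot z = x \cdot (y \cdot z)$
A6   $x + \delta = x$
A7   $\delta \cdot x = \delta$
A8   $x \cdot \epsilon = x$
A9   $\epsilon \cdot x = x$

Table 2. The axioms for $\mathrm{BPA}_{\delta\epsilon}(A)$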


NEI1   $x \cdot (x^\omega) = x^\omega$
$\mathrm{RSP}^\omega$   $x = y \cdot x \implies x = y^\omega$

Table 3. The axioms for the perpetual loop in the absence of $\epsilon$
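As a small worked instance of how the two axioms for the perpetual loop interact (added here for illustration, not taken from the paper), the equation $a^\omega = (aa)^\omega$ can be derived by unfolding the loop twice with the defining equation and then applying the conditional axiom:

```latex
a^\omega
  \overset{\text{NEI1}}{=} a(a^\omega)
  \overset{\text{NEI1}}{=} a(a(a^\omega))
  \overset{\text{A5}}{=} (aa)(a^\omega)
\quad\Longrightarrow\quad
a^\omega \overset{\mathrm{RSP}^\omega}{=} (aa)^\omega
```

The conditional axiom is instantiated with $x := a^\omega$ and $y := aa$. Repeating the unfolding $n$ times gives $a^\omega = (a^n)^\omega$ in the same way.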
However, the axiom $\mathrm{RSP}^\omega$ is not sound in the presence of the empty process. Namely, due to the axiom A9, $x = \epsilon x$, it then implies $x = \epsilon^\omega$, which is clearly unsound. Therefore, in Table 4 an adaptation $\mathrm{RSP}^\omega_\epsilon$ is introduced, where the condition $y \not\surd$ expresses that $y$ cannot terminate immediately. This condition, which is similar to the so-called guardedness restriction in the Recursive Specification Principle from Bergstra and Klop [10], can be defined inductively on the syntax:

$a \not\surd$
$\delta \not\surd$
$x \not\surd \wedge y \not\surd \implies (x + y) \not\surd$
$x \not\surd \vee y \not\surd \implies (x \cdot y) \not\surd$
$(x^\omega) \not\surd$
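The inductive definition translates directly into a recursive check. The following Python sketch assumes that the five clauses read as follows: actions and deadlock cannot terminate immediately, a sum cannot if neither summand can, a sequential composition cannot if at least one of its arguments cannot, and a perpetual loop never can. The empty process is the only constant not covered by a clause, so it is treated as able to terminate immediately. The tuple encoding and the function name are illustrative assumptions.

```python
def cannot_terminate_immediately(t):
    """Syntactic check of the side condition of the adapted RSP axiom.

    Terms are nested tuples: ("act", name), ("delta",), ("eps",),
    ("alt", p, q), ("seq", p, q) and ("omega", p).
    """
    kind = t[0]
    if kind in ("act", "delta", "omega"):
        return True
    if kind == "eps":
        return False
    if kind == "alt":
        return (cannot_terminate_immediately(t[1])
                and cannot_terminate_immediately(t[2]))
    if kind == "seq":
        return (cannot_terminate_immediately(t[1])
                or cannot_terminate_immediately(t[2]))
    raise ValueError(f"unknown term constructor: {kind!r}")
```

Under this reading, the instance $x = \epsilon \cdot x$ that makes the unrestricted axiom unsound is excluded, because the check fails for `("eps",)`.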

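Table 4 contains the defining equation NEI1, and an extra equation NEI2 which describes the interplay of no-exit iteration with the empty process. The axiomatization A1-9+NEI1,2+$\mathrm{RSP}^\omega_\epsilon$ is sound for $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$.

NEI1   $x \cdot (x^\omega) = x^\omega$
NEI2   $(x + \epsilon)^\omega = x^\omega$
$\mathrm{RSP}^\omega_\epsilon$   $x = y \cdot x \wedge y \not\surd \implies x = y^\omega$

Table 4. The axioms for the perpetual loop in the presence of $\epsilon$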


The purpose of this paper is to present the following three completeness results.

Theorem 4. The axiomatization A1-9+NEI1,2+$\mathrm{RSP}^\omega_\epsilon$ is complete for $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$ with respect to bisimulation.

That is, if $p \underline{\leftrightarrow} q$ for process terms $p$ and $q$ in $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$, then $p = q$ can be derived from the axioms A1-9+NEI1,2+$\mathrm{RSP}^\omega_\epsilon$.

Theorem 5. The axiomatization A1-7+NEI1+$\mathrm{RSP}^\omega$ is complete for $\mathrm{BPA}^\omega_\delta(A)$ with respect to bisimulation.

Theorem 6. The axiomatization A1-5+NEI1+$\mathrm{RSP}^\omega$ is complete for $\mathrm{BPA}^\omega(A)$ with respect to bisimulation.

This paper focuses on the completeness proof for $\mathrm{BPA}^\omega_\delta(A)$. The completeness proof for $\mathrm{BPA}^\omega(A)$ is closely related to the one for $\mathrm{BPA}^\omega_\delta(A)$ (missing only some minor cases for $\delta$ in the construction of basic terms in Lemma 17). The completeness proof for $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$ also uses the same proof strategy, but, due to the presence of the empty process, the technical details are considerably more complicated. The reader is referred to [12] for a detailed exposition on the completeness proof for $\mathrm{BPA}^\omega_{\delta\epsilon}(A)$.

3  Proof  of  the  Main  Theorem 

This section presents preliminaries that are needed in the proof of Theorem 5, together with the completeness proof itself. Many preliminary definitions in this section originate from [11]. For omitted proofs the reader is referred to [12].

3.1 Expansions

From now on, process terms in $\mathrm{BPA}^\omega_\delta(A)$ are considered modulo associativity and commutativity of the $+$, that is, modulo the axioms A1,2. We write $p =_{AC} q$ if $p$ and $q$ can be equated by axioms A1,2. As usual, $\sum_{i=1}^{n} p_i$ represents the term $p_1 + \cdots + p_n$, and the $p_i$ are called the summands of this term. The empty sum represents $\delta$, where $\sum_{i \in \emptyset} p_i + q$ is not considered empty.

Definition 7. For each process term $p$, its collection of possible transitions is finite, say $\{p \xrightarrow{a_i} p_i \mid i = 1, \ldots, n\} \cup \{p \xrightarrow{b_j} \surd \mid j = 1, \ldots, m\}$. The expansion of $p$ is

$$\sum_{i=1}^{n} a_i p_i + \sum_{j=1}^{m} b_j.$$

Lemma 8. Each process term $p$ in $\mathrm{BPA}^\omega_\delta(A)$ is provably equal to its expansion, using A4-7+NEI1.

Proof: By structural induction with respect to $p$.
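A small instance may help to see why the defining equation is needed for expansions (this example is added for illustration). The term $(a + b)^\omega$ has exactly two transitions, labelled $a$ and $b$, both leading back to $(a + b)^\omega$, so its expansion is $a((a + b)^\omega) + b((a + b)^\omega)$. The term is provably equal to this expansion by one unfolding of the loop followed by right distributivity:

```latex
(a + b)^\omega
  \overset{\text{NEI1}}{=} (a + b)\bigl((a + b)^\omega\bigr)
  \overset{\text{A4}}{=} a\bigl((a + b)^\omega\bigr) + b\bigl((a + b)^\omega\bigr)
```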

3.2 Normed Processes

The following terminology stems from [4].

Definition 9. A process term $p$ is called normed if it can terminate in finitely many transitions, that is, $p \xrightarrow{a_0} p_1 \xrightarrow{a_1} \cdots \xrightarrow{a_{n-1}} p_n \xrightarrow{a_n} \surd$.

The class of normed processes in $\mathrm{BPA}^\omega_\delta(A)$ can be defined inductively as follows:

- $a \in A$ is normed;

- if $p$ or $q$ is normed, then $p + q$ is normed;

- if $p$ and $q$ are normed, then $pq$ is normed.
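The three clauses above can be read as a recursive test on the syntax. The Python sketch below follows them literally; since the clauses do not mention deadlock or the perpetual loop, terms headed by either are never normed. The tuple encoding of terms and the function name are illustrative assumptions.

```python
def is_normed(t):
    """Decide whether a term can terminate in finitely many transitions.

    Terms are nested tuples: ("act", name), ("delta",), ("alt", p, q),
    ("seq", p, q) and ("omega", p).
    """
    kind = t[0]
    if kind == "act":
        return True
    if kind == "alt":
        return is_normed(t[1]) or is_normed(t[2])
    if kind == "seq":
        return is_normed(t[1]) and is_normed(t[2])
    if kind in ("delta", "omega"):
        return False
    raise ValueError(f"unknown term constructor: {kind!r}")
```

For example, `("alt", ("delta",), ("act", "a"))` is normed, while `("seq", ("act", "a"), ("omega", ("act", "a")))` is not, because the loop in its second argument never terminates.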

Lemma 10. If $p$ is not normed, then $pq = p$ is provable using A4,5,7+NEI1+$\mathrm{RSP}^\omega$.

Proof: By structural induction with respect to $p$.


3.3  An  Ordering  on  Pairs  of  Terms 

The following weight function $g$ on process terms in $\mathrm{BPA}^\omega_\delta(A)$, which represents the maximum nesting of $\omega$'s in a term, will be used to formulate an ordering on pairs of terms.

$$g(a) = 0$$
$$g(\delta) = 0$$
$$g(p + q) = \max\{g(p), g(q)\}$$
$$g(pq) = \max\{g(p), g(q)\}$$
$$g(p^\omega) = g(p) + 1.$$

Note that the $g$-value is invariant under axioms A1,2. The following lemma can easily be deduced, using structural induction.

Lemma 11. If $p'$ is a derivative of $p$, then $g(p') \leq g(p)$.
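The weight function is a plain structural recursion. A minimal Python sketch, using an illustrative tuple encoding of terms and a hypothetical function name `weight`, could look as follows:

```python
def weight(t):
    """Maximum nesting depth of the perpetual loop operator in a term.

    Terms are nested tuples: ("act", name), ("delta",), ("alt", p, q),
    ("seq", p, q) and ("omega", p).
    """
    kind = t[0]
    if kind in ("act", "delta"):
        return 0
    if kind in ("alt", "seq"):
        return max(weight(t[1]), weight(t[2]))
    if kind == "omega":
        return weight(t[1]) + 1
    raise ValueError(f"unknown term constructor: {kind!r}")
```

Because the sum and the sequential composition both take the maximum of their arguments, reordering or regrouping summands leaves the value unchanged, in line with the remark on axioms A1,2.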

We consider pairs of process terms modulo commutativity. The ordering $<$ on pairs of process terms is defined as follows.

Definition 12. The ordering $<$ on pairs of terms is obtained by taking the transitive closure of the union of the three relations below.

1. $(r, s) < (p, q)$ if $g(r) < g(p)$ and $g(s) < g(p)$;

2. $(r, s) < (p, q)$ if $g(r) < g(p)$ and $g(s) \leq g(q)$;

3. $(p', q') < (p, q)$ if $p'$ is a derivative of $p$, and not vice versa, and $q'$ is a derivative of $q$.

The proof of the completeness theorem is based on induction with respect to this ordering, so we need to know that it is well-founded.

Lemma 13. The ordering $<$ on pairs of process terms is well-founded modulo $=_{AC}$.

Proof: Omitted.

3.4  Basic  Terms 

We construct a set $\mathbb{B}$ of basic process terms, such that each process term is provably equal to a basic term, and the derivatives of basic terms are basic terms. We will prove the completeness theorem by showing that bisimilar basic terms are provably equal.

Definition 14. The set $\mathbb{B}$ of basic process terms is defined inductively as follows:

1. if $a_i, b_j \in A$ and $p_1, \ldots, p_n \in \mathbb{B}$, then $\sum_{i=1}^{n} a_i p_i + \sum_{j=1}^{m} b_j \in \mathbb{B}$;

2. if $p \in \mathbb{B}$ then $p^\omega \in \mathbb{B}$;

3. if $p \in \mathbb{B}$ and $p'$ is a proper derivative of $p$, then $p'(p^\omega) \in \mathbb{B}$.

For notational convenience, we distinguish the following set $\mathbb{C}$ of cycles in $\mathbb{B}$.

Definition 15. $\mathbb{C} = \{p^\omega, p'(p^\omega) \mid p \in \mathbb{B},\ p' \text{ proper derivative of } p\}$.

The following facts for basic terms will be needed in the completeness proof.

Lemma 16.

1. If $p \in \mathbb{C}$ and $p \xrightarrow{a} p'$, then $p' \in \mathbb{C}$.

2. If $p \in \mathbb{B}$ and $p \xrightarrow{a} p'$, then $p' \in \mathbb{B}$.

3. If $p \in \mathbb{B}$ and $p$ is a proper derivative of itself, then $p \in \mathbb{C}$.

Lemma 17. For each term $p$ there exists a basic term $q$ with $g(q) \leq g(p)$ such that $p = q$ is provable using A4-7+NEI1+$\mathrm{RSP}^\omega$.

3.5 The Auxiliary Function $\phi_p$

Before starting with the completeness proof, first we need to develop some theory. The proposition that will be proved at the end of this section makes an important stepping stone to obtain the desired completeness result for $\mathrm{BPA}^\omega_\delta(A)$.

$p'(p^\omega) \underline{\leftrightarrow} p''(p^\omega)$, with $p'$ and $p''$ derivatives of $p$, does not imply $p' \underline{\leftrightarrow} p''$. For example, clearly $a((aa)^\omega) \underline{\leftrightarrow} aa((aa)^\omega)$, but $a \not\underline{\leftrightarrow} aa$. In order to solve this ambiguity, we define an operator $\phi_p$ on basic terms, where intuitively the term $\phi_p(q)$, for $q \notin \mathbb{C}$, is obtained from the argument $q$ as follows: all proper derivatives $q'$ of $q$ with $q'(p^\omega) \underline{\leftrightarrow} p^\omega$ are removed in $\phi_p(q)$. We will see that if $p'(p^\omega) \underline{\leftrightarrow} p''(p^\omega)$ then $\phi_p(p') \underline{\leftrightarrow} \phi_p(p'')$.

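Definition 18. Given $q \in \mathbb{B}$, the term $\phi_p(q)$ is defined as follows, using structural induction. We distinguish two cases: either $q \in \mathbb{C}$ or $q \notin \mathbb{C}$.

- Case 1: $q \in \mathbb{C}$. Then put

$$\phi_p(q) =_{AC} q.$$

- Case 2: $q \notin \mathbb{C}$, so that

$$q =_{AC} \sum_{i \in I} a_i q_i + \sum_{j \in J} b_j.$$

Then define $I_0 = \{i \in I \mid q_i(p^\omega) \not\underline{\leftrightarrow} p^\omega\}$ and

$$\phi_p(q) =_{AC} \sum_{i \in I_0} a_i \phi_p(q_i) + \sum_{i \in I \setminus I_0} a_i + \sum_{j \in J} b_j.$$

Lemma 19. For $q \in \mathbb{B}$ we have $g(\phi_p(q)) \leq g(q)$.

Proof. By structural induction with respect to $q$.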

The proofs of the next two technical lemmas are quite involved, and therefore omitted.

Lemma 20. Assume that for some natural number $N_0$:

A. for all terms $u$ with $g(u) < N_0$ we have $p^\omega \not\underline{\leftrightarrow} u$.

Let $q, r \in \mathbb{B}$ and $g(q + r) < N_0$. If $q(p^\omega) \underline{\leftrightarrow} r(p^\omega)$ then

$$\phi_p(q) \underline{\leftrightarrow} \phi_p(r).$$

Proof. Omitted.

Lemma 21. Assume that for some natural number $N_0$:

A. for all terms $u$ with $g(u) < N_0$ we have $p^\omega \not\underline{\leftrightarrow} u$;

B. for all pairs $(u, v)$ of bisimilar terms with $g(u + v) < N_0$ we have $u = v$.

Let $p, q \in \mathbb{B}$ and $g(p + q) < N_0$. Then

$$q(\phi_p(p)^\omega) = \phi_p(q)(\phi_p(p)^\omega).$$

Proof. Omitted.

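Proposition 22. Assume that for some natural number $N_0$:

A. for all terms $u$ with $g(u) < N_0$ we have $p^\omega \not\underline{\leftrightarrow} u$;

B. for all pairs $(u, v)$ of bisimilar terms with $g(u + v) < N_0$ we have $u = v$.

Let $g(p + q + r) < N_0$ and $q(p^\omega) \underline{\leftrightarrow} r(p^\omega)$. Then

$$q(p^\omega) = r(p^\omega).$$

Proof. By Lemma 17

$$p = s \qquad (1)$$

with $s \in \mathbb{B}$ and $g(s) \leq g(p) < N_0$. Since conditions A and B hold, Lemma 21 can be applied to derive $s(\phi_s(s)^\omega) = \phi_s(s)(\phi_s(s)^\omega) \overset{\text{NEI1}}{=} \phi_s(s)^\omega$. $\mathrm{RSP}^\omega$ then yields

$$s^\omega = \phi_s(s)^\omega. \qquad (2)$$

According to Lemma 17 there exist basic terms $t$ and $u$ with $g(t) \leq g(q) < N_0$ and $g(u) \leq g(r) < N_0$ and

$$q = t, \qquad (3)$$

$$r = u. \qquad (4)$$

Since $t(s^\omega) \underline{\leftrightarrow} q(p^\omega) \underline{\leftrightarrow} r(p^\omega) \underline{\leftrightarrow} u(s^\omega)$, and since $g(t + u) < N_0$ and requirement A of Lemma 20 is satisfied, it implies $\phi_s(t) \underline{\leftrightarrow} \phi_s(u)$. Since $g(\phi_s(t) + \phi_s(u)) < N_0$ (Lemma 19), condition B yields

$$\phi_s(t) = \phi_s(u). \qquad (5)$$

Hence,

$$q(p^\omega) = t(s^\omega) = \phi_s(t)(\phi_s(s)^\omega) = \phi_s(u)(\phi_s(s)^\omega) = u(\phi_s(s)^\omega) = r(p^\omega). \qquad \Box$$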


3.6  Completeness  Proof 


Proof of Theorem 5: Assume $p, q \in \mathbb{B}$ with $p \underline{\leftrightarrow} q$; we show that $p = q$ can be derived from A1-7+NEI1+$\mathrm{RSP}^\omega$, by induction on the well-founded ordering $<$ on pairs of terms. So suppose that we have already dealt with pairs of bisimilar basic terms that are smaller than $(p, q)$. By symmetry it is sufficient to consider two cases: either $p \notin \mathbb{C}$ or $p, q \in \mathbb{C}$.

- Case 1: $p \notin \mathbb{C}$.

According to Lemma 8, $p$ and $q$ are provably equal to their expansions. Since $p \underline{\leftrightarrow} q$, these expansions can be adapted, using axiom A3, to obtain:

$$p = \sum_{i=1}^{n} a_i p_i + \sum_{j=1}^{m} b_j, \qquad q = \sum_{i=1}^{n} a_i q_i + \sum_{j=1}^{m} b_j,$$

where $p_i \underline{\leftrightarrow} q_i$ for $i = 1, \ldots, n$. Since $p \notin \mathbb{C}$, Lemma 16.3 says that $p$ is not a derivative of $p_i$ for $i = 1, \ldots, n$. Since the $p_i$ and the $q_i$ for $i = 1, \ldots, n$ are derivatives of $p$ and $q$ respectively, it follows that $(p_i, q_i) < (p, q)$ for $i = 1, \ldots, n$ (by item 3 in Definition 12). So induction yields $p_i = q_i$ for $i = 1, \ldots, n$. Hence, $p = q$.

- Case 2: $p, q \in \mathbb{C}$.

Since $p \in \mathbb{C}$, either $p =_{AC} r^\omega = r(r^\omega)$ or $p =_{AC} r'(r^\omega)$, where $r \in \mathbb{B}$ and $r'$ is a proper derivative of $r$. In both cases $p = r'(r^\omega)$ with $r \in \mathbb{B}$ and $r'$ a derivative (not necessarily proper) of $r$. Likewise, $q = s'(s^\omega)$ with $s \in \mathbb{B}$ and $s'$ a derivative of $s$.

By symmetry, it is sufficient to distinguish two cases: either $r'$ is not normed, or both $r'$ and $s'$ are normed.

* Case 2.1: $r'$ is not normed.

Then by Lemma 10 $r'(r^\omega) = r'$. Since $g(r') \leq g(r) < g(p)$, item 2 in Definition 12 yields $(r', q) < (p, q)$. So, since $r' \underline{\leftrightarrow} r'(r^\omega) \underline{\leftrightarrow} q$, induction yields $r' = q$. Hence, $p = r'(r^\omega) = r' = q$.

■k  Case  2.2:  Both  r'  and  s'  are  normed. 

For  convenience  of  notation  put  No  =  max{p(p),  p(g)}.  Again,  we  consider  two  cases: 
either  there  exists  or  there  does  not  exist  a  term  t  with  g{t)  <  Nq  and  p  ±>  L 
o  Case  2.2.1:  There  exists  a  term  t  with  g[t)  <  No  and  p  ±t  t  (and  so  q  ±t  i)- 
Since  by  the  assumption  at  case  2.2  r'  is  normed,  and  r'(r‘^)  ±±t,  there  exists  a 
derivative  t'  of  t  with  r"^  ±±  and  so  rt'  j±  t'.  Furthermore,  Lemma  11  implies 
g(t')  <  g{t)  <  No,  and  so  g{rt'  1')  <  No-  So  after  using  Lemma  17  to  reduce  rt' 
and  t'  to  basic  form,  we  can  apply  induction,  by  item  1  in  Definition  12,  to  conclude 
rt'  =  t'.  RSP‘^  then  yields  r"^  =  t' ,  so  p  =  r't'.  By  Lemma  17  r't'  =  u  with  u  E  B  and 
g(u)  <  Nq.  Thus,  p  =  u.  Even  so,  q  =  v  for  some  basic  term  v  with  g{v)  <  Nq.  Then 
u  ±±p  ±^q  ±¥v,so  since  g{u-^v)  <  No,  induction  yields  u  =  v.  Hence,  p  =  u  =  v  =  q. 
o  Case  2.2.2:  For  each  term  t,  if  g{t)  <  No  then  p  ^  t  (and  so  q  t). 

Since $p \mathrel{\underline{\leftrightarrow}} q$, the assumption of this case implies $g(p) = g(q)$.

Note that the requirements A and B for Proposition 22 are satisfied, by the assumption
at case 2.2.2 together with the induction hypothesis (item 1 of Definition 12).
So we are allowed to apply Proposition 22 in this case.

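By the assumption at case 2.2 $r'$ is normed, so since $r'(r^\omega) \mathrel{\underline{\leftrightarrow}} s'(s^\omega)$, there exists a
derivative $s''$ of $s$ such that $r^\omega \mathrel{\underline{\leftrightarrow}} s''(s^\omega)$. Likewise, there exists a derivative
$r''$ of $r$ such that $r''(r^\omega) \mathrel{\underline{\leftrightarrow}} s^\omega$.

Since $s''r''(r^\omega) \mathrel{\underline{\leftrightarrow}} s''(s^\omega) \mathrel{\underline{\leftrightarrow}} r^\omega \mathrel{\underline{\leftrightarrow}} r(r^\omega)$, and $g(s''r'' + r) < N_0$, Proposition 22 yields
$s''r''(r^\omega) = r(r^\omega)$. RSP$^\omega$ then yields

$$r^\omega = (s''r'')^\omega. \tag{6}$$

Likewise,

$$s^\omega = (r''s'')^\omega. \tag{7}$$

Since $s''((r''s'')^\omega) = s''((r''s'')((r''s'')^\omega)) = (s''r'')(s''((r''s'')^\omega))$, RSP$^\omega$ yields

$$s''((r''s'')^\omega) = (s''r'')^\omega. \tag{8}$$

Since $r's''(s^\omega) \mathrel{\underline{\leftrightarrow}} r's''((r''s'')^\omega) \mathrel{\underline{\leftrightarrow}} r'((s''r'')^\omega) \mathrel{\underline{\leftrightarrow}} r'(r^\omega) \mathrel{\underline{\leftrightarrow}} s'(s^\omega)$, and $g(r's'' + s') < N_0$,
Proposition 22 yields

$$r's''(s^\omega) = s'(s^\omega). \tag{9}$$

So finally,

$$p = r'(r^\omega) \overset{(6)}{=} r'((s''r'')^\omega) \overset{(8)}{=} r's''((r''s'')^\omega) \overset{(7)}{=} r's''(s^\omega) \overset{(9)}{=} s'(s^\omega) =_{AC} q. \qquad \Box$$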


3.7  An  Example 

We give an example of how the construction in the completeness proof acts on
particular pairs of bisimilar basic terms.

Example 1. $(a\delta + b)((c(a\delta + b))^\omega) \mathrel{\underline{\leftrightarrow}} (a\delta + bc)^\omega$.

This equivalence belongs to case 2.2.2. It can be derived as follows.

$$(a\delta + b)((c(a\delta + b))^\omega) = (a\delta + b)((c(a\delta + b))((c(a\delta + b))^\omega)) = ((a\delta + b)c)((a\delta + b)((c(a\delta + b))^\omega)).$$

Then RSP$^\omega$ yields

$$(a\delta + b)((c(a\delta + b))^\omega) = ((a\delta + b)c)^\omega. \tag{10}$$

So finally,

$$(a\delta + b)((c(a\delta + b))^\omega) \overset{(10)}{=} ((a\delta + b)c)^\omega = (a\delta + bc)^\omega.$$


4  Conclusion 

In this paper, Milner's axiomatization for iteration was restricted to the case of no-exit
iteration, and it was proved that this yields a complete axiomatization for no-exit
iteration in process algebra modulo bisimulation. The main new idea in the proof was to
introduce a function $\phi$ which can help to minimize the argument $p$ of a no-exit iteration
term $p^\omega$, in such a way that $p$ does not contain any proper derivatives $p'$ with $p'(p^\omega) \mathrel{\underline{\leftrightarrow}} p^\omega$.

For example, using this function $\phi$, the term $(aa)^\omega$ can be reduced to $a^\omega$.
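
The equality behind this reduction can also be derived directly in the axiom system. The following sketch assumes the unfolding law $x^\omega = x(x^\omega)$, associativity of sequential composition, and the rule RSP$^\omega$ (from $x = yx$ infer $x = y^\omega$), in the form in which they are used in the completeness proof; it does not go through the function $\phi$ itself.

```latex
\begin{align*}
a^\omega &= a(a^\omega)      && \text{(unfolding)} \\
         &= a(a(a^\omega))   && \text{(unfolding)} \\
         &= (aa)(a^\omega)   && \text{(associativity)}
\end{align*}
% RSP^\omega, instantiated with x = a^\omega and y = aa, then gives
\[ a^\omega = (aa)^\omega . \]
```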

The completeness result in this paper may be a step toward a positive answer to
the question whether Milner's axiomatization is complete for iteration in process algebra
modulo bisimulation. Namely, the main problem in solving this question is to deal with
no-exit iteration terms $p^\omega$ where $p$ is not minimal. Unfortunately, it is not obvious how
to extend the definition of the function $\phi$ to all terms in process algebra with iteration.
For example, consider the term

$(a((a(ba + a))^*c))^\omega$

where the argument $a((a(ba + a))^*c)$ of no-exit iteration is not minimal. Minimization of
this argument would yield a so-called 'double-exit' term (with exits $b$ and $c$), which cannot
be expressed in process algebra with iteration modulo bisimulation (see [6, 1]). The only
way to obtain a no-exit iteration term with a minimal argument in this particular case
is to rewrite the term to

$a((a(ba + a) + ca)^\omega)$

A minimization strategy for all possible arguments of no-exit iteration would probably
be the key to solving Milner's question.
Abstract. Rectangular hybrid automata model digital control programs of analog
plant environments. We study rectangular hybrid automata where the plant state evolves
continuously in real-numbered time, and the controller samples the plant state and
changes the control state discretely, only at the integer points in time. We prove that
rectangular hybrid automata have finite bisimilarity quotients when all control transitions
happen at integer times, even if the constraints on the derivatives of the variables
vary between control states. This is sharply in contrast with the conventional model
where control transitions may happen at any real time, and already the reachability
problem is undecidable. Based on the finite bisimilarity quotients, we give an exponential
algorithm for the symbolic sampling-controller synthesis of rectangular automata.
We show our algorithm to be optimal by proving the problem to be EXPTIME-hard.
We also show that rectangular automata form a maximal class of systems for which
the sampling-controller synthesis problem can be solved algorithmically.

1  Introduction 

Hybrid systems are dynamical systems with both discrete and continuous components. A
paradigmatic example of a hybrid system is a digital control program for an analog plant
environment, like a furnace or an airplane: the controller state moves discretely between control
modes, and in each control mode, the plant state evolves continuously according to physical
laws. A natural mathematical model for hybrid systems is the hybrid automaton, which
represents discrete components using finite-state machines and continuous components using
real-numbered variables [ACH+95]. A particularly important subclass of hybrid automata
are the rectangular automata, where in each control mode $v$, the given $n$ variables follow a
nondeterministic differential equation of the form $\frac{dx}{dt} \in B(v)$, for an $n$-dimensional
rectangle $B(v) \subseteq \mathbb{R}^n$ [HKPV95]. Rectangular automata are useful as (1) they can be made to
approximate, arbitrarily closely, complex continuous behavior using lower and upper bounds
on derivatives [HH95], and (2) they can be analyzed automatically using (semi)algorithms
based on symbolic execution, such as those implemented in HyTech [HHW97].

For systems that can be executed symbolically, verification and control yield to a
(semi)algorithmic approach even if the state space is infinite [Hen96]. For such systems, a
temporal formula can be verified automatically and a controller can be synthesized
automatically by computing, using iterative approximation, a fixpoint of an operator on state
sets [BCM+92, MPS95]. The fixpoint computation is guaranteed to terminate in the
presence of a suitable finite quotient space. For example, symbolically-executable systems with
finite bisimilarity quotients allow symbolic LTL and CTL model checking, and symbolic

* This research was supported in part by the ONR YIP award N00014-95-1-0520, by the NSF
CAREER award CCR-9501708, by the NSF grant CCR-9504469, by the AFOSR contract
F49620-93-1-0056, by the ARO MURI contract DAAH-04-96-1-0341, by the ARO contract
DAAL03-91-C-0027 through the MSI at Cornell University, by the ARPA grant NAG2-892, and by the SRC
contract 95-DC-324.036.
safety controller synthesis. While rectangular automata can be executed symbolically, they
do  not  necessarily  have  finite  bisimilarity  quotients,  and  simple  reachability  questions  are 
undecidable  [HKPV95].  A  noted  subclass  of  rectangular  automata  with  finite  bisimilarity 
quotients  are  timed  automata,  where  all  variables  are  clocks  with  derivative  1  [AD94].  As 
a  consequence,  the  symbolic  model  checking  and  controller  synthesis  problems  have  been 
solved  for  timed  automata  [HNSY94,  MPS95]. 

While  previous  results  on  timed  and  hybrid  automata  allow  edge  transitions  (i.e.,  control 
switches)  to  occur  at  any  real-numbered  points  in  time,  this  is  not  necessarily  a  natural 
assumption  for  controller  synthesis,  as  it  permits  controllers  that,  in  a  single  time  unit,  can 
interact  with  the  plant  an  unbounded  number  of  times  (even  infinitely  often,  if  no  special 
care  is  taken  [AH97]).  By  contrast,  we  study  the  control  problem  under  the  assumption  that 
while  the  plant  evolves  continuously,  the  controller  samples  the  plant  state  discretely,  at 
the integer points in time only.¹ This leads to the following formulation of the sampling-
controller  synthesis  problem  for  rectangular  automata:  given  a  continuous-time  rectangular 
automaton,  is  there  a  discrete-time  controller  that  samples  the  automaton  state  at  integer 
times  and  switches  the  control  mode  accordingly  so  that  the  resulting  closed-loop  system 
satisfies  a  given  invariant? 

To solve this problem, we study the discrete-time transition systems of timed and
rectangular automata, where all time transitions have unit duration. It should be noticed that all
variables still evolve continuously, in real-numbered time; only edge transitions are restricted
to discrete time. We prove that unlike in the case of dense time, the discrete-time transition
system of every rectangular automaton has a finite bisimilarity quotient.² As a corollary, we
conclude that the standard approaches to symbolic model checking and controller synthesis
are guaranteed to terminate when all control switches must occur at integer times. The
running times of the verification and control algorithms depend on the number of bisimilarity
equivalence classes, which, while exponential in the description of the automaton, is less
by a multiplicative exponential factor than the number of region equivalence classes used
for the dense-time verification and control of timed automata. Thus, the often more realistic
sampling-controller synthesis problem can be solved for a wider class of hybrid systems
than dense-time control (rectangular vs. timed), at a smaller cost.

We  prove  that  our  sampling-control  algorithm  is  optimal,  by  giving  lower  bounds  on 
the  control  problem  for  timed  and  hybrid  systems:  we  show  that  the  safety  control  decision 
problem  (does  there  exist  a  controller  that  maintains  an  invariant?)  is  complete  for  EXPTIME 
already  in  the  restricted  case  of  discrete-time  timed  automata.  We  also  identify  the  boundary 
of  sampling  controllability  by  proving  that  several  generalizations  of  rectangular  automata 
lead  to  an  undecidable  reachability  problem,  even  in  discrete  time.  The  undecidability  of 
dense-time reachability for rectangular automata has led [PV94] to consider the restriction
that the flow rectangle $B(v)$ must be the same for each control mode $v$. For the resulting
class of initialized rectangular automata, reachability is decidable [HKPV95]. Our work can
be  viewed  as  pointing  out  an  orthogonal  restriction  of  rectangularity,  namely,  that  the  flow 
rectangle  may  change  only  at  integer  points  in  time.  Unlike  initialization,  our  restriction 
guarantees  not  only  a  finite  language  equivalence  quotient  but  a  finite  bisimilarity  quotient 
on  the  infinite  state  space  of  a  rectangular  automaton. 

¹ The sampling rate of the controller may be any rational, but without loss of generality we assume it
to be 1.

² Under the technical restriction that either the invariant and flow rectangles are positive, or the
automaton state stays within a bounded region.
2 Definitions and Previous Results

2.1  Labeled  Transition  Systems 

Definition 2.1 [Transition system] A transition system $S = (Q, \Sigma, \to, Q_I, \Pi, \models)$ consists
of a set $Q$ of states, a finite set $\Sigma$ of events, a multiset $\to\ \subseteq Q \times \Sigma \times Q$ called the
transition relation, a set $Q_I \subseteq Q$ of initial states, a set $\Pi$ of propositions, and a satisfaction
relation $\models\ \subseteq Q \times \Pi$. We write $q \xrightarrow{\sigma} q'$ instead of $(q, \sigma, q') \in\ \to$, and $q \models \pi$ instead of
$(q, \pi) \in\ \models$. The transition system $S$ is finite if $Q$ is finite. We assume for simplicity that
$S$ is deadlock-free; that is, for each state $q \in Q$, there exists an event $\sigma \in \Sigma$ and a state
$r \in Q$ such that $q \xrightarrow{\sigma} r$. A region is a subset of $Q$. Given a proposition $\pi \in \Pi$, write
$R_\pi = \{q \in Q \mid q \models \pi\}$ for the region of states that satisfy $\pi$. ■

Verification  as  reachability 

Definition 2.2 [Weakest precondition] Let $S$ be a transition system. For each event $\sigma \in \Sigma$,
the $\sigma$-predecessor operator $Pre_\sigma : 2^Q \to 2^Q$ is defined by $Pre_\sigma(R) = \{q \in Q \mid \exists r \in R.\ q \xrightarrow{\sigma} r\}$.
In particular, $Pre_\sigma(Q)$ is the set of states in which the event $\sigma$ is enabled. Define
$Pre : 2^Q \to 2^Q$ by $Pre(R) = \bigcup_{\sigma \in \Sigma} Pre_\sigma(R)$. A region $R \subseteq Q$ is reachable in $S$ if
$Q_I \cap Pre^k(R) \neq \emptyset$ for some $k \in \mathbb{N}$. ■
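
For a finite transition system, the predecessor operators and the reachability test of Definition 2.2 can be sketched as follows. The encoding of transitions as triples, the function names, and the four-state example are assumptions made for illustration only; they are not part of the formal development.

```python
from typing import Hashable, Iterable, Set, Tuple

State = Hashable
Transition = Tuple[State, str, State]  # (q, sigma, q') stands for q --sigma--> q'


def pre_sigma(transitions: Iterable[Transition], sigma: str, region: Set[State]) -> Set[State]:
    """Pre_sigma(R): states with at least one sigma-successor inside R."""
    return {q for (q, s, r) in transitions if s == sigma and r in region}


def pre(transitions: Iterable[Transition], region: Set[State]) -> Set[State]:
    """Pre(R): union of Pre_sigma(R) over all events sigma."""
    return {q for (q, _sigma, r) in transitions if r in region}


def is_reachable(transitions: Iterable[Transition], initial: Set[State], region: Set[State]) -> bool:
    """True iff Q_I intersects Pre^k(R) for some k >= 0 (finite systems only)."""
    transitions = list(transitions)
    seen: Set[State] = set()
    frontier = set(region)  # Pre^0(R) = R
    while frontier:
        if frontier & initial:
            return True
        seen |= frontier
        # States already seen were compared with Q_I in an earlier round.
        frontier = pre(transitions, frontier) - seen
    return False


# Hypothetical four-state example.
T = [("a", "go", "b"), ("b", "go", "c"), ("c", "stay", "c"), ("d", "stay", "d")]
print(is_reachable(T, {"a"}, {"c"}))  # True
print(is_reachable(T, {"d"}, {"c"}))  # False
```

The search explores each state at most once, which matches the remark below that reachability on finite systems is a graph problem.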

The basic verification problem for transition systems asks whether an unsafe state is
unreachable.

Definition 2.3 [Safety verification] Let $\mathcal{C}$ be a class of transition systems. The safety
verification problem for $\mathcal{C}$ is stated in the following way: given a transition system $S \in \mathcal{C}$ and a
proposition $\pi \in \Pi$, determine whether the region $R_\pi$ is not reachable in $S$. ■

For  finite  transition  systems,  the  safety  verification  problem  is  the  complement  of  graph 
reachability,  which  can  be  solved  in  linear  time  and  is  complete  for  NLOGSPACE.  The 
safety  verification  problem  can  be  generalized  to  the  safety  control  problem. 

Control as alternating reachability We use the following model for control: for each
state $q$ of a transition system, a (memory-free) controller chooses an enabled event $\sigma$ so
that in state $q$, the controlled system always proceeds via event $\sigma$. Since $q$ may have several
$\sigma$-successors, the controlled system may still be nondeterministic. Alternative models for
memory-free control are equivalent.

Definition 2.4 [Control map] Let $S$ be a transition system. A control map for $S$ is a function
$\kappa : Q \to \Sigma$ such that for each state $q \in Q$, there exists a state $r \in Q$ with $q \xrightarrow{\kappa(q)} r$. The
closed-loop system $\kappa(S)$ is the transition system $(Q, \Sigma, \to_\kappa, Q_I, \Pi, \models)$, where $q \xrightarrow{\sigma}_\kappa q'$ iff
$q \xrightarrow{\sigma} q'$ and $\kappa(q) = \sigma$. ■

The  basic  control  problem  for  transition  systems  asks  whether  an  unsafe  state  is  avoidable 
by  applying  some  control  map. 

Definition 2.5 [Safety control] Let $\mathcal{C}$ be a class of transition systems. The safety control
decision problem for $\mathcal{C}$ is stated in the following way: given a transition system $S \in \mathcal{C}$ and a
proposition $\pi \in \Pi$, determine whether there exists a control map $\kappa$ such that the region $R_\pi$
is not reachable in the closed-loop system $\kappa(S)$. If so, then we say $\pi$ is avoidable in $S$. The
safety controller synthesis problem requires the construction of a witnessing control map $\kappa$
when $\pi$ is avoidable. ■
For finite transition systems, the safety control decision problem is the complement of
AND-OR graph reachability, which can be solved in quadratic time and is complete for
PTIME.

Definition 2.6 [Alternating reachability] An AND-OR graph $G = (V_A, V_O, V_I, \to)$ consists
of a finite set $V = V_A \cup V_O$ of vertices that is partitioned into a set $V_A$ of AND vertices and a
set $V_O$ of OR vertices, a set $V_I \subseteq V$ of initial vertices, and a multiset $\to\ \subseteq V \times V$ of edges.
We assume deadlock freedom, namely, that for each vertex $v \in V$, there exists a vertex
$w \in V$ such that $v \to w$. The controllable predecessor operator $CPre : 2^V \to 2^V$ is defined
by $CPre(R) = \{q \in V_O \mid \exists r \in R.\ q \to r\} \cup \{q \in V_A \mid \forall r \in V.\ q \to r \text{ implies } r \in R\}$. A
set $R \subseteq V$ of vertices is alternating reachable in $G$ if $V_I \cap CPre^k(R) \neq \emptyset$ for some $k \in \mathbb{N}$.
The alternating reachability problem asks whether a given set of vertices is alternating
reachable in a given AND-OR graph. ■
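
The operator $CPre$ and the alternating reachability test can be transcribed almost literally for small graphs. The sketch below assumes a graph given as explicit vertex sets and an edge list; the function names and the two example graphs are hypothetical. It iterates $CPre$ exactly as in the definition and stops when an iterate repeats, so it is meant as an illustration of the definition rather than as an efficient algorithm.

```python
from typing import Hashable, Iterable, Set, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def cpre(and_vertices: Set[Vertex], or_vertices: Set[Vertex],
         edges: Iterable[Edge], region) -> Set[Vertex]:
    """CPre(R): OR vertices with some successor in R, AND vertices with all successors in R."""
    successors = {v: set() for v in set(and_vertices) | set(or_vertices)}
    for v, w in edges:
        successors[v].add(w)
    some_successor = {q for q in or_vertices if successors[q] & region}
    # Deadlock freedom guarantees that successors[q] is non-empty here.
    all_successors = {q for q in and_vertices if successors[q] <= region}
    return some_successor | all_successors


def alternating_reachable(and_vertices, or_vertices, edges, initial, region) -> bool:
    """True iff V_I intersects CPre^k(R) for some k >= 0."""
    edges = list(edges)
    current = frozenset(region)  # CPre^0(R) = R
    visited = set()
    while current not in visited:  # the sequence of iterates is eventually periodic
        if current & initial:
            return True
        visited.add(current)
        current = frozenset(cpre(and_vertices, or_vertices, edges, current))
    return False


# Hypothetical example: AND vertex "a" must have all its successors lead to "t".
AND, OR = {"a", "t", "s"}, {"o1", "o2"}
forced = [("a", "o1"), ("a", "o2"), ("o1", "t"), ("o2", "t"), ("t", "t"), ("s", "s")]
escape = [("a", "o1"), ("a", "o2"), ("o1", "t"), ("o2", "s"), ("t", "t"), ("s", "s")]
print(alternating_reachable(AND, OR, forced, {"a"}, {"t"}))  # True
print(alternating_reachable(AND, OR, escape, {"a"}, {"t"}))  # False
```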

Theorem 2.1 [Imm81] The alternating reachability problem is complete for PTIME.

There is a simple correspondence between safety control and alternating reachability. Let
$S$ be a finite transition system and let $\pi$ be a proposition. Define an AND-OR graph $G_S$ as
follows: let $V_A = Q$ and $V_O = Q \times \Sigma$ and $V_I = Q_I$; for each vertex $q \in V_A$ and each event
$\sigma \in \Sigma$, let $q \to (q, \sigma)$ in $G_S$ iff $q \in Pre_\sigma(Q)$ in $S$; and for each vertex $(q, \sigma) \in V_O$, let
$(q, \sigma) \to r$ in $G_S$ iff $q \xrightarrow{\sigma} r$ in $S$. Then the proposition $\pi$ is avoidable in $S$ iff the set $R_\pi$ of
AND vertices is not alternating reachable in $G_S$.

Corollary  2.1  The  safety  control  decision  problem  for  finite  transition  systems  is  complete 
for  PTIME. 

Moreover, a byproduct of a negative alternating reachability computation is a control map that
avoids $\pi$. Note that for each set $R \subseteq Q$ of AND vertices, $CPre^2(R) = \bigcap_{\sigma \in \Sigma} (Pre_\sigma(R) \cup
(Q \setminus Pre_\sigma(Q)))$. Thus the region $CPre^2(R)$ is the set of all states that no control map
can keep out of $R$ at the next transition. Let $R_F$ be the region obtained by iterating $CPre^2$
on $R_\pi$ until convergence. Then $\pi$ is avoidable
in $S$ iff $Q_I \cap R_F = \emptyset$. Each application of $CPre^2$ can be computed in linear time, so
$R_F$ can be computed in quadratic time. If $\pi$ is indeed avoidable, then a witnessing control
map may be constructed by choosing for each state $q \in Q \setminus R_F$ an event $\sigma$ such that
$q \in Pre_\sigma(Q) \setminus Pre_\sigma(R_F)$.
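
The construction just described can be sketched for finite systems as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function and variable names and the example system are hypothetical, and the fixpoint is computed by accumulating the iterates (the least fixpoint of $R \mapsto R_\pi \cup CPre^2(R)$), which agrees with plain iteration of $CPre^2$ whenever unsafe states cannot be left, as in the example.

```python
from typing import Dict, Hashable, List, Optional, Set, Tuple

State = Hashable
Transition = Tuple[State, str, State]


def synthesize(states: Set[State], events: List[str], transitions: List[Transition],
               initial: Set[State], unsafe: Set[State]) -> Optional[Dict[State, str]]:
    """Return a control map that avoids `unsafe`, or None if it is not avoidable."""

    def pre_sigma(sigma: str, region: Set[State]) -> Set[State]:
        return {q for (q, s, r) in transitions if s == sigma and r in region}

    enabled = {s: pre_sigma(s, states) for s in events}  # Pre_sigma(Q)

    def cpre2(region: Set[State]) -> Set[State]:
        # States where every enabled event has some successor in `region`.
        result = set(states)
        for s in events:
            result &= pre_sigma(s, region) | (states - enabled[s])
        return result

    losing = set(unsafe)  # plays the role of R_F once the loop has converged
    while True:
        larger = losing | cpre2(losing)
        if larger == losing:
            break
        losing = larger

    if losing & initial:
        return None

    control: Dict[State, str] = {}
    for q in states - losing:
        for s in events:
            if q in enabled[s] and q not in pre_sigma(s, losing):
                control[q] = s  # an enabled event with no successor in R_F
                break
    return control


# Hypothetical example: state 3 is unsafe and absorbing; state 2 is forced into it.
Q = {0, 1, 2, 3}
E = ["a", "b"]
T = [(0, "a", 1), (0, "b", 1), (0, "b", 2), (1, "a", 1), (2, "a", 3), (3, "a", 3)]
print(synthesize(Q, E, T, {0}, {3}))  # {0: 'a', 1: 'a'}
print(synthesize(Q, E, T, {2}, {3}))  # None
```

In state 0 the event b is rejected because one of its successors is state 2, from which the unsafe state cannot be avoided; this is the situation the set difference $Pre_\sigma(Q) \setminus Pre_\sigma(R_F)$ rules out.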

Theorem  2.2  [RW87]  The  safety  controller  synthesis  problem  for  finite  transition  systems 
can  be  solved  in  quadratic  time. 

Effectively-presented  transition  systems  with  finite  bisimilarity  quotients  The  safety 
controller  synthesis  problem  can  be  solved  not  only  for  finite  transition  systems,  but  also  for 
effectively-presented  transition  systems  with  finite  bisimilarity  quotients. 

Definition 2.7 [Effective presentation] A symbolic execution theory for the transition system
$S$ consists of a set $\Phi$ of formulas, a formula $\phi_I \in \Phi$, and a map $[\![\cdot]\!] : \Phi \to 2^Q$ such that
(1) every proposition $\pi \in \Pi$ is a formula: $[\![\pi]\!] = R_\pi$; (2) for all formulas $\phi_1, \phi_2 \in \Phi$, the
three expressions $\phi_1 \wedge \phi_2$ and $\phi_1 \vee \phi_2$ and $\neg\phi_1$ are formulas: $[\![\phi_1 \wedge \phi_2]\!] = [\![\phi_1]\!] \cap [\![\phi_2]\!]$ and
$[\![\phi_1 \vee \phi_2]\!] = [\![\phi_1]\!] \cup [\![\phi_2]\!]$ and $[\![\neg\phi_1]\!] = Q \setminus [\![\phi_1]\!]$; (3) $[\![\phi_I]\!] = Q_I$; (4) the set $\{\phi \in \Phi \mid [\![\phi]\!] = \emptyset\}$
is recursive; and (5) for each event $\sigma \in \Sigma$, there is a computable map $Pre_\sigma : \Phi \to \Phi$ such
that $[\![Pre_\sigma(\phi)]\!] = Pre_\sigma([\![\phi]\!])$ for all formulas $\phi \in \Phi$. An effectively-presented transition
system consists of a transition system $S$ together with a symbolic execution theory for $S$. ■

Definition 2.8 [Bisimilarity] A bisimulation on the transition system $S$ is an equivalence
relation $\equiv$ on the state set $Q$ such that (1) if $q \equiv r$ then for all propositions $\pi \in \Pi$, we
have $q \models \pi$ iff $r \models \pi$, and (2) if $q \equiv r$ and $q \xrightarrow{\sigma} q'$, then there exists a state $r' \in Q$ such
that $r \xrightarrow{\sigma} r'$ and $q' \equiv r'$. The largest bisimulation on $S$ is denoted by $\cong$. The bisimilarity
quotient $S/{\cong}$ is the transition system $(Q/{\cong}, \Sigma, \to_\exists, Q_\exists, \Pi, \models_\exists)$, where $R \xrightarrow{\sigma}_\exists R'$ iff there
exist two states $q \in R$ and $q' \in R'$ such that $q \xrightarrow{\sigma} q'$, where $R \in Q_\exists$ iff $R \cap Q_I \neq \emptyset$, and
where $R \models_\exists \pi$ iff $R \cap R_\pi \neq \emptyset$. ■

The controllable-predecessor operator $CPre^2$ can be computed on any effectively-presented
transition system. When the bisimilarity quotient has $k \in \mathbb{N}$ equivalence classes, the $R_F$
computation converges in at most $k$ iterations of $CPre^2$. Synthesizing a control map is
accomplished by first computing the bisimilarity quotient, and then choosing for each state
in each equivalence class $R$ disjoint from $R_F$, an event $\sigma \in \Sigma$ such that $R \cap Pre_\sigma(Q) \neq \emptyset$
and $R \cap Pre_\sigma(R_F) = \emptyset$.

Theorem 2.3 [Hen95] The safety control decision problem is decidable for effectively-presented
transition systems with finite bisimilarity quotients. Moreover, when a proposition
is avoidable, a witnessing control map can be computed.

This result can be generalized to liveness verification such as $\mu$-calculus model checking, and
to memory-free liveness control such as control-map synthesis for Rabin chain conditions.

2.2  Rectangular  Hybrid  Automata 

Definition 2.9 [Rectangle] Let $X = \{x_1, \ldots, x_n\}$ be a set of real-valued variables. A
rectangular inequality over $X$ is a formula of the form $x_i \sim c$, where $c$ is an integer
constant, and $\sim$ is one of $<, \leq, >, \geq$. A rectangular predicate over $X$ is a conjunction of
rectangular inequalities. The rectangular predicate $\phi$ defines the set of vectors $[\![\phi]\!] = \{y \in
\mathbb{R}^n \mid \phi[X := y] \text{ is true}\}$. A set of the form $[\![\phi]\!]$, where $\phi$ is a rectangular predicate, is called
a rectangle. Given a positive integer $m \in \mathbb{N}_{>0}$, the rectangular predicate $\phi$ and the rectangle
$[\![\phi]\!]$ are $m$-definable if $|c| \leq m$ for every conjunct $x_i \sim c$ of $\phi$. The set of all rectangular
predicates over $X$ is denoted $Rect(X)$. ■
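
As a concrete reading of Definition 2.9, a rectangular predicate can be stored as a list of inequalities. The encoding as triples `(i, op, c)` with a zero-based variable index, and the function names, are assumptions of this sketch.

```python
import operator
from typing import List, Sequence, Tuple

# One conjunct x_i ~ c, encoded as (i, "~", c) with a zero-based index i (an assumption).
Inequality = Tuple[int, str, int]
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def satisfies(predicate: List[Inequality], y: Sequence[float]) -> bool:
    """True iff the vector y lies in the rectangle defined by the predicate."""
    return all(OPS[op](y[i], c) for (i, op, c) in predicate)


def is_m_definable(predicate: List[Inequality], m: int) -> bool:
    """True iff every constant c in the predicate satisfies |c| <= m."""
    return all(abs(c) <= m for (_i, _op, c) in predicate)


# The rectangle [0, 3) x (-inf, 2] over two variables.
phi = [(0, ">=", 0), (0, "<", 3), (1, "<=", 2)]
print(satisfies(phi, (1.5, 2)))   # True
print(satisfies(phi, (3, 0)))     # False
print(is_m_definable(phi, 3))     # True
print(is_m_definable(phi, 2))     # False
```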

Definition  2.10  [Rectangular  automaton]  [HKPV95]  A  rectangular  automaton  A  consists 
of  the  following  components: 

Variables. A finite set $X = \{x_1, \ldots, x_n\}$ of real-valued variables representing the
continuous component of the system. The number $n$ is the dimension of $A$. We write $\dot{X}$ for
the set $\{\dot{x}_i \mid x_i \in X\}$ of dotted variables, and $X'$ for the set $\{x'_i \mid x_i \in X\}$ of primed
variables.

Control graph. A finite directed multigraph $(V, E)$ representing the discrete component of
the system. The vertices in $V$ are called control modes. The edges in $E$ are called control
switches.

Invariant conditions. A function $inv : V \to Rect(X)$ mapping each control mode to its
invariant condition, a rectangular predicate.

Initial conditions. A function $init : V \to Rect(X)$ mapping each control mode to its initial
condition, a rectangular predicate.

Jump conditions. A function $jump$ mapping each control switch $e \in E$ to a predicate
$jump(e)$ of the form $\phi \wedge \phi' \wedge \bigwedge_{i \notin update(e)} (x'_i = x_i)$, where $\phi \in Rect(X)$ and $\phi' \in
Rect(X')$ are rectangular predicates, and $update(e) \subseteq \{1, \ldots, n\}$. The jump condition
$jump(e)$ specifies the effect of the change in control mode on the values of the variables:
each unprimed variable $x_i$ refers to a value before the control switch $e$, and each primed
variable $x'_i$ refers to the corresponding value after the control switch.

Flow conditions. A function $flow : V \to Rect(\dot{X})$ mapping each control mode $v$ to a flow
condition, a rectangular predicate that constrains the behavior of the first derivatives of
the variables while time passes in control mode $v$.

Events. A finite set $\Sigma$ of events, and a function $event : E \to \Sigma$ mapping each control switch
to an event.

Thus a rectangular automaton $A$ is a tuple $(X, V, E, inv, init, jump, flow, \Sigma, event)$. The
automaton $A$ is $m$-definable if every rectangular predicate in the definition of $A$ is
$m$-definable. The automaton $A$ is positive if for every control mode $v \in V$, the invariant
rectangle $[\![inv(v)]\!]$ and the flow rectangle $[\![flow(v)]\!]$ are subsets of the positive orthant $\mathbb{R}^n_{\geq 0}$.
The automaton $A$ is bounded if for every control mode $v \in V$, the invariant rectangle
$[\![inv(v)]\!]$ is a bounded set. ■

The  state  of  a  rectangular  automaton  has  two  parts:  a  discrete  (or  control)  part,  and  a 
continuous  (or  plant)  part.  The  discrete  state  is  a  control  mode.  The  continuous  state  is  a 
valuation  for  the  variables. 

Definition 2.11 [States of rectangular automata] Let $A$ be a rectangular automaton. A state
of $A$ is a pair $(v, y)$, where $v \in V$ is a control mode and $y \in [\![inv(v)]\!]$ is a vector satisfying the
invariant condition of $v$. Thus the set of states is $Q = \{(v, y) \in V \times \mathbb{R}^n \mid y \in [\![inv(v)]\!]\}$. A
subset of $Q$ is called a region of $A$. A rectangular state predicate for $A$ is a function $\psi$ from
$V$ to $Rect(X)$. The rectangular state predicate $\psi$ defines the region $[\![\psi]\!] = \{(v, y) \in Q \mid y \in
[\![\psi(v)]\!]\}$. A region of the form $[\![\psi]\!]$, where $\psi$ is a rectangular state predicate for $A$, is called
a rectangular region. The initial condition map defines the rectangular region $Q_I = [\![init]\!]$
of initial states. ■

A rectangular automaton makes two types of transitions: jump (or edge, or control)
transitions, and flow (or time, or plant) transitions. Jump transitions are instantaneous. They are
characterized  by  a  change  in  control  mode,  and  are  accompanied  by  discrete  modifications 
to  the  variables  in  accordance  with  the  jump  condition  of  the  control  switch.  During  flow 
transitions,  while  time  elapses,  the  control  mode  remains  fixed  and  the  variables  evolve 
continuously  via  a  trajectory  that  satisfies  the  flow  condition  of  the  active  control  mode. 

Definition 2.12 [Transitions of rectangular automata] Let $A$ be a rectangular automaton.
For each event $\sigma \in \Sigma$, define the jump relation $\xrightarrow{\sigma}\ \subseteq Q^2$ by $(v, y) \xrightarrow{\sigma} (v', y')$ iff there
exists a control switch $e = (v, v') \in E$ such that $event(e) = \sigma$ and $(y, y') \in [\![jump(e)]\!]$. For
each nonnegative real $\delta \in \mathbb{R}_{\geq 0}$, we define the flow relation $\xrightarrow{\delta}\ \subseteq Q^2$ by $(v, y) \xrightarrow{\delta} (v', y')$
iff (1) $v = v'$ and (2) there exists a differentiable function $f : [0, \delta] \to [\![inv(v)]\!]$ such
that $f(0) = y$ and $f(\delta) = y'$, and $\dot{f}(\epsilon) \in [\![flow(v)]\!]$ for all reals $\epsilon \in (0, \delta)$, where $\dot{f}$
is the first derivative of $f$. We say that $\delta$ is the duration of the flow transition. Since the
rectangle $[\![inv(v)]\!]$ is a convex set, it follows that for $\delta > 0$, condition (2) is equivalent to
$\frac{y' - y}{\delta} \in [\![flow(v)]\!]$; that is, all flows can be thought of as straight lines. ■
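
The straight-line characterization makes flow transitions easy to check mechanically. The sketch below reuses a hypothetical list-of-inequalities encoding of rectangular predicates (triples `(i, op, c)` with zero-based index) and tests a flow transition inside a single control mode by examining only its endpoints and its average slope; exact rational arithmetic is a choice of this sketch.

```python
import operator
from fractions import Fraction
from typing import List, Sequence, Tuple

Inequality = Tuple[int, str, int]  # x_i ~ c encoded as (i, "~", c); an assumed encoding
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt, ">=": operator.ge}


def satisfies(predicate: List[Inequality], y: Sequence) -> bool:
    """True iff the vector y lies in the rectangle defined by the predicate."""
    return all(OPS[op](y[i], c) for (i, op, c) in predicate)


def flow_step(inv: List[Inequality], flow: List[Inequality],
              y: Sequence, y_next: Sequence, delta) -> bool:
    """Check a flow transition of duration delta from y to y_next in one control mode."""
    if delta < 0 or not (satisfies(inv, y) and satisfies(inv, y_next)):
        return False
    if delta == 0:
        return tuple(y) == tuple(y_next)  # no time passes, so nothing may change
    # For delta > 0 it suffices to test the slope of the straight line from y to y_next.
    slope = [(Fraction(b) - Fraction(a)) / Fraction(delta) for a, b in zip(y, y_next)]
    return satisfies(flow, slope)


# One variable with invariant 0 <= x <= 10 and flow 1 <= dx/dt <= 2.
inv = [(0, ">=", 0), (0, "<=", 10)]
flow = [(0, ">=", 1), (0, "<=", 2)]
print(flow_step(inv, flow, (0,), (1.5,), 1))  # True
print(flow_step(inv, flow, (0,), (3,), 1))    # False
print(flow_step(inv, flow, (0,), (3,), 2))    # True
```

Fixing `delta = 1` in this check gives the time steps of the discrete-time transition system introduced next.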

Every  rectangular  automaton  defines  two  transition  systems. 

Definition 2.13 [Discrete time and dense time] Let $A$ be a rectangular automaton. Define
the binary relation $\xrightarrow{time}\ \subseteq Q^2$ by $(v, y) \xrightarrow{time} (v', y')$ iff $(v, y) \xrightarrow{\delta} (v', y')$ for some duration
$\delta \in \mathbb{R}_{\geq 0}$. Define $\Pi$ to be the set of rectangular state predicates for $A$, and for all states
$(v, y) \in Q$, define $(v, y) \models \pi$ iff $(v, y) \in [\![\pi]\!]$. The discrete-time transition system of $A$
is defined by $S_A^{disc} = (Q, \Sigma \cup \{1\}, \to, Q_I, \Pi, \models)$. The dense-time transition system of
$A$ is defined by $S_A^{dense} = (Q, \Sigma \cup \{time\}, \to, Q_I, \Pi, \models)$. Thus all flow transitions in the
discrete-time transition system are required to have duration 1, while flow transitions in
the dense-time transition system can have any nonnegative real duration. We refer to the
safety verification problem for transition systems of the form $S_A^{disc}$ (resp. $S_A^{dense}$), for some
rectangular automaton $A$, as the discrete-time (resp. dense-time) safety verification problem
for rectangular automata, and similarly for the control decision and controller synthesis
problems. ■

Dense-time undecidability results In dense time, the verification and control of
rectangular automata cannot be fully automated.

Theorem  2.4  [ACH+95]  For  positive  and  bounded  rectangular  automata,  the  dense-time 
safety  verification  problem  (and  thus  the  dense-time  safety  control  decision  problem)  is 
undecidable. 

Research has therefore concentrated on subclasses of rectangular automata. In [HKPV95] it
is shown that for initialized rectangular automata, whose flow condition map is a constant
function (i.e., all control modes have the same flow condition), the dense-time safety
verification problem (in fact, LTL model checking) can be decided. These automata, however,
have no finite bisimilarity quotients in dense time [Hen95], and therefore further restrictions
are desirable.

Timed automata An important special case of initialized rectangular automata are timed
automata. All variables of a timed automaton are clocks, which advance uniformly at rate 1
while time elapses.

Definition 2.14 [Timed automaton] [AD94] A timed automaton is a positive rectangular
automaton $A$ with the restriction that $flow(v) = \bigwedge_{i=1}^{n} (\dot{x}_i = 1)$ for every control mode $v$. A
triangular inequality over a set $X$ of variables is a formula of the form $x_i - x_j \sim c$, where
$x_i, x_j \in X$ are variables, $c$ is an integer constant, and $\sim$ is one of $<, \leq, >, \geq$. A triangular
predicate over $X$ is a conjunction of rectangular and triangular inequalities. A triangular
state predicate for a timed automaton $A$ is a function that maps every control mode of $A$ to
a triangular predicate over the variables of $A$. ■

The fundamental theorem for timed automata states that the dense-time transition system
$S_A^{dense}$ of a timed automaton $A$ has a finite bisimilarity quotient and can be presented
effectively using triangular state predicates.

Theorem 2.5 [AD94, HNSY94] For every $m$-definable $n$-dimensional timed automaton $A$
with $k$ control modes, the dense-time transition system $S_A^{dense}$ has a finite bisimilarity quotient
with $O(k \cdot (n + 1)! \cdot (2m)^n)$ many equivalence classes. Moreover, the boolean combinations
of triangular state predicates for $A$ form a symbolic execution theory for $S_A^{dense}$.

Corollary 2.2 For timed automata, the dense-time safety verification problem (in fact, LTL
and CTL model checking) can be solved in PSPACE, and the dense-time safety controller
synthesis problem can be solved in EXPTIME.

As  for  finite  transition  systems,  control  is  harder  than  verification.  In  [AD94]  it  is  shown  that 
the  dense-time  safety  verification  problem  for  timed  automata  is  hard  for  PSPACE.  From 
Theorem  3.2  below  it  follows  that  the  dense-time  safety  control  decision  problem  for  timed 
automata  is  hard  for  EXPTIME. 
3 Discrete-Time Rectangular Automata

3.1  Finite  Bisimilarity  Quotients  and  Effective  Presentation 

We  show  that  the  discrete-time  transition  system  of  a  positive  or  bounded  rectangular 
automaton  A  has  a  finite  bisimilarity  quotient  and  can  be  presented  effectively  using  rectan¬ 
gular  state  predicates.  More  precisely,  in  discrete  time,  two  states  of  a  rectangular  automaton 
are  bisimilar  if  (1)  they  have  the  same  control  mode,  (2)  corresponding  variable  values 
agree  on  their  integer  parts,  and  (3)  corresponding  variable  values  agree  on  whether  they 
are  integral.  Moreover,  if  an  m-definable  rectangular  automaton  is  positive,  then  it  cannot 
distinguish variable values greater than $m$. For m-definable bounded rectangular automata,
the continuous part of the state is contained in the cube $[-m, m]^n$. It follows that in both the
positive  and  the  bounded  case,  the  bisimilarity  quotient  is  finite. 

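Definition 3.1 Define the equivalence relation $\approx_n$ on $\mathbb{R}^n$ by $y \approx_n z$ iff $\lfloor y_i \rfloor = \lfloor z_i \rfloor$ and
$\lceil y_i \rceil = \lceil z_i \rceil$ for all $1 \le i \le n$. Given $m \in \mathbb{N}_{>0}$, define the equivalence relation $\approx_n^m$ on $\mathbb{R}^n$
by $y \approx_n^m z$ iff for each $1 \le i \le n$, either $y_i \approx_1 z_i$, or both $y_i$ and $z_i$ are greater than $m$,
or both $y_i$ and $z_i$ are less than $-m$. For an n-dimensional rectangular automaton $A$, define
the equivalence relations $\cong_A$ and $\cong_A^m$ on the states of $A$ by $(v, y) \cong_A (w, z)$ iff $v = w$ and
$y \approx_n z$, and $(v, y) \cong_A^m (w, z)$ iff $v = w$ and $y \approx_n^m z$. ■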
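The relations of Definition 3.1 can be checked mechanically. The following is a minimal Python sketch, assuming vectors are given as sequences of floats; the function names are illustrative and do not come from the paper.

```python
import math
from typing import Sequence

def equiv_1(y: float, z: float) -> bool:
    """One-dimensional relation: same floor and same ceiling, i.e. the same
    integer part and the same answer to 'is the value integral?'."""
    return math.floor(y) == math.floor(z) and math.ceil(y) == math.ceil(z)

def equiv_1_m(y: float, z: float, m: int) -> bool:
    """m-bounded variant: values beyond m (or below -m) are not distinguished."""
    return equiv_1(y, z) or (y > m and z > m) or (y < -m and z < -m)

def equiv_n(ys: Sequence[float], zs: Sequence[float]) -> bool:
    """Componentwise extension of equiv_1 to vectors of equal length."""
    return len(ys) == len(zs) and all(equiv_1(y, z) for y, z in zip(ys, zs))

def equiv_n_m(ys: Sequence[float], zs: Sequence[float], m: int) -> bool:
    """Componentwise extension of equiv_1_m to vectors of equal length."""
    return len(ys) == len(zs) and all(equiv_1_m(y, z, m) for y, z in zip(ys, zs))
```

For instance, 0.25 and 0.75 are equivalent (both lie strictly between 0 and 1), whereas 1.0 and 1.5 are not, because only the first is integral.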

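Lemma 3.1 Consider two vectors $y, z \in \mathbb{R}^n$. Then $y \approx_n z$ iff for every rectangle $B \subseteq \mathbb{R}^n$,
we have $y \in B$ iff $z \in B$. Moreover, $y \approx_n^m z$ iff for every m-definable rectangle $B \subseteq \mathbb{R}^n$,
we have $y \in B$ iff $z \in B$.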

Theorem 3.1 Let $A$ be an n-dimensional rectangular automaton with $k$ control modes. The
equivalence relation $\cong_A$ is a bisimulation on the discrete-time transition system $S_A^{disc}$. If $A$
is m-definable and either positive or bounded, then $\cong_A^m$ is also a bisimulation on $S_A^{disc}$. The
number of equivalence classes of $\cong_A^m$ is $k \cdot (4m + 3)^n$.
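The count $k \cdot (4m+3)^n$ can be read off one coordinate at a time: there are $2m+1$ integer points in $[-m, m]$, $2m$ open unit intervals between them, and two unbounded classes. A small Python sketch of this enumeration follows; the indexing scheme and function names are illustrative assumptions, not part of the paper.

```python
import math

def class_index(x: float, m: int) -> int:
    """Index in 0 .. 4m+2 of the class of x under the one-dimensional
    m-bounded relation (index layout chosen here for illustration)."""
    if x < -m:
        return 0                      # everything below -m
    if x > m:
        return 4 * m + 2              # everything above m
    lo = math.floor(x)
    if x == lo:
        return 1 + 2 * (lo + m)       # integer point lo in [-m, m]
    return 2 + 2 * (lo + m)           # open interval (lo, lo + 1)

def num_classes(k: int, m: int, n: int) -> int:
    """Number of classes for k control modes and n variables."""
    return k * (4 * m + 3) ** n
```

Sampling the real line at quarter steps for $m = 2$ hits exactly $4 \cdot 2 + 3 = 11$ distinct indices.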

Proof. We argue that $\cong_A^m$ is a bisimulation for positive m-definable $A$; the other parts
of the proof are similar. Suppose that $(v, y) \cong_A^m (w, z)$ and $(v, y) \xrightarrow{\sigma} (v', y')$. We must
show that there exists a state $(w', z')$ such that $(w, z) \xrightarrow{\sigma} (w', z')$ and $(v', y') \cong_A^m (w', z')$.

First, assume that $\sigma \in \Sigma$. In this case there exists a control switch $e$ with source $v = w$
such that $event(e) = \sigma$ and $(y, y') \in [\![jump(e)]\!]$, and $y'_i = y_i$ for each $i \notin update(e)$.
Define $z'$ by $z'_i = z_i$ for $i \notin update(e)$, and $z'_i = y'_i$ for $i \in update(e)$. By Lemma 3.1,
$(z, z') \in [\![jump(e)]\!]$ and $z' \in [\![inv(v')]\!]$. It follows that $(w, z) \xrightarrow{\sigma} (v', z')$.

Second, assume that $\sigma = 1$ (cf. Fig. 1). In this case $v' = v = w$, and $y' - y \in [\![flow(v)]\!]$.
We must show that there exists a vector $z'$ such that $z' - z \in [\![flow(v)]\!]$ and $y' \approx_n^m z'$ (notice
that by Lemma 3.1, $y' \approx_n^m z'$ implies $z' \in [\![inv(v)]\!]$). We do this one coordinate at a time.

Fix $i \in \{1, \ldots, n\}$. Suppose that $y_i > m$. It follows that $y'_i > m$ and $z_i > m$, because $A$ is
positive. Choose any $c \in [\![flow(v)]\!]_i$, and define $z'_i = z_i + c$. Since $c \ge 0$, we have $y'_i \approx_1^m z'_i$.
Now suppose that $y_i \le m$. If $y_i \in \mathbb{N}$ then $z_i = y_i$, because $y_i \approx_1 z_i$. Define $z'_i = y'_i$. Then
$z'_i - z_i \in [\![flow(v)]\!]_i$. If $y_i \notin \mathbb{N}$ then $\lfloor y_i \rfloor < y_i, z_i < \lceil y_i \rceil$. The set $[\![flow(v)]\!]_i$ is an
interval, say, with endpoints $a, b \in \mathbb{N}$ (it is easy to extend the argument to the case $b = \infty$).
Thus $[\![flow(v)]\!]_i$ contains the open interval $(a, b)$, and $y'_i \in [y_i + a, y_i + b]$. We show that
there exists a number $c \in (a, b)$ such that $y'_i \approx_1 z_i + c$. Since $a, b \in \mathbb{N}$ and $y_i \approx_1 z_i$, it
follows that $y_i + a \approx_1 z_i + a$ and $y_i + b \approx_1 z_i + b$. Thus the closed interval $[z_i + a, z_i + b]$
intersects the same $\approx_1$-equivalence classes as does $[y_i + a, y_i + b]$. Since neither $z_i + a$ nor
$z_i + b$ is an integer, the same is true for the open interval $(z_i + a, z_i + b)$. Therefore there
exists a number $c \in (a, b)$ such that $y'_i \approx_1 z_i + c$. ■

Fig. 1. Given a control mode $v$, consider the flow condition $flow(v) = (1 \le \dot{x}_1 \le 3 \wedge 1 \le \dot{x}_2 \le 2)$.
Let $B = [\![3 < x_1 < 4 \wedge 2 < x_2 < 3]\!]$ and $P = [\![0 < x_1 < 3 \wedge 0 < x_2 < 2]\!]$. Then
$Pre_1(\{v\} \times B) = \{v\} \times P$.
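The predecessor computation illustrated by Fig. 1 reduces to interval arithmetic, one coordinate at a time. The Python sketch below assumes an open target box, a closed box of admissible slopes, and a unit-duration flow; it ignores the invariant. The function name and the numeric values (target $3 < x_1 < 4$, $2 < x_2 < 3$; slopes $1 \le \dot{x}_1 \le 3$, $1 \le \dot{x}_2 \le 2$) are the reading of the caption assumed here.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (lo, hi); open for boxes, closed for slopes

def pre_unit_flow(target: List[Interval], flow: List[Interval]) -> List[Interval]:
    """Open box of points x such that x + c lies in the open box `target`
    for some slope vector c in the closed box `flow` (one time unit)."""
    # x + c in (lo, hi) for some c in [a, b]  iff  x in (lo - b, hi - a)
    return [(lo - b, hi - a) for (lo, hi), (a, b) in zip(target, flow)]

box = pre_unit_flow([(3, 4), (2, 3)], [(1, 3), (1, 2)])
```

Under these assumptions `box` is the open rectangle $0 < x_1 < 3$, $0 < x_2 < 2$.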

Corollary 3.1 For every rectangular automaton $A$, the boolean combinations of rectangular
state predicates for $A$ form a symbolic execution theory for the discrete-time transition
system $S_A^{disc}$.

Corollary 3.2 For positive or bounded rectangular automata, the discrete-time safety verification
problem (in fact, LTL and CTL model checking) can be solved in PSPACE, and the
discrete-time safety controller synthesis problem can be solved in EXPTIME.

The LTL and CTL parts of the corollary follow from the facts that both model-checking
problems can be solved in space logarithmic in the size of the transition system and polynomial
in the size of the temporal formula [Kup95]. It should be noted that while in the same
complexity class, the actual running times of the discrete-time algorithms for rectangular
automata are better by a multiplicative exponential factor than the running times of the
corresponding dense-time algorithms for timed automata. This is because there, the number
of equivalence classes of the bisimilarity quotient is $\Omega(k \cdot n! \cdot (m+1)^n)$. By providing tight
lower bounds, the following theorem shows that our algorithms are optimal. The second part
of the theorem follows from Theorem 3.4 below.

Theorem 3.2 For bounded timed automata, the discrete-time safety verification problem is
hard for PSPACE [AD94], and the discrete-time safety control decision problem is hard for
EXPTIME.

3.2  Sampling-Controller  Synthesis 

The  dense-time  and  discrete-time  control  problems  are  not  realistic,  as  a  controller  may 
enforce  arbitrarily  many  (even  infinitely  many)  consecutive  instantaneous  jumps.  A  more 
natural  control  model  for  hybrid  systems  involves  a  controller  that  samples  the  plant  state 
once  per  time  unit,  and  then  issues  a  command  based  upon  its  measurement.  The  command 
may  cause  a  switch  in  control  mode,  after  which  the  plant  state  evolves  continuously  for 
one time unit, before receiving the next command. We call this model "sampling control" to
distinguish it from discrete-time control. Moreover, we wish to ensure that a proposition is
avoided not only at the sampling points but also between sampling points. Given a rectangular
automaton $A$, we define a third transition system, the sampling-control transition system of $A$,
such that (1) any control map behaves in a sampling manner and (2) the propositional regions
are "large enough" so that they cannot be entered and left by a single flow transition of
duration 1. For example, if $\pi$ is a rectangular state predicate that maps each control mode
of $A$ to either true or false, then $\pi$ is large enough. If the region of unsafe states is not large
enough, this may be correctable by increasing the sampling rate (i.e., by reducing the unit of time).

Definition 3.2 [Sampling control] Let $A$ be a rectangular automaton. A rectangular state
predicate $\pi \in \Pi$ is large enough for $A$ if there are no three states $(v, y), (v, y') \notin R_\pi$
and $(v, y'') \in R_\pi$ such that $(v, y) \xrightarrow{\delta} (v, y'')$ and $(v, y'') \xrightarrow{1-\delta} (v, y')$ for some real
$\delta \in (0, 1)$. Define $\Pi' \subseteq \Pi$ to be the set of rectangular state predicates that are large
enough for $A$, and define $((v, y), \lambda) \models' \pi$ iff $(v, y) \models \pi$. The sampling-control transition
system of $A$ is defined as the tuple $(Q \times \{control, plant\}, \Sigma \cup \{1\}, \Rightarrow, Q^0 \times \{control\}, \Pi', \models')$,
where the binary relation $\Rightarrow$ is defined by: (1) for each event $\sigma \in \Sigma$,
we have $((v, y), control) \xRightarrow{\sigma} ((v', y'), plant)$ iff $(v, y) \xrightarrow{\sigma} (v', y')$, and (2) $((v, y), plant) \xRightarrow{1}
((v', y'), control)$ iff $(v, y) \xrightarrow{1} (v', y')$. Thus in the sampling-control transition system the
controller and the plant take turns: first the controller specifies a jump transition, then
one time unit passes in a flow transition, and so on. We refer to the safety control decision
problem for transition systems that arise in this way from some rectangular automaton $A$
as the sampling-control decision problem for rectangular automata, and similarly for the
sampling-controller synthesis problem. ■

Theorem 3.3 For positive or bounded rectangular automata, the sampling-controller synthesis
problem can be solved in EXPTIME.

Proof. Consider an n-dimensional positive or bounded rectangular automaton $A$. We reduce
the sampling-control problems to discrete-time control problems by constructing a
rectangular automaton $Ctrl(A)$ such that $S_{Ctrl(A)}^{disc}$ is isomorphic to the sampling-control
transition system of $A$. Moreover, if
$A$ is positive, then $Ctrl(A)$ is positive, and if $A$ is bounded, then $Ctrl(A)$ is bounded.
Let $X_{Ctrl(A)} = X_A \cup \{x_{n+1}\}$ for a clock $x_{n+1} \notin X_A$. The control graph and events
of $Ctrl(A)$ are identical to those of $A$. Let $inv_{Ctrl(A)}(v) = inv_A(v) \wedge 0 \le x_{n+1} \le 1$,
let $init_{Ctrl(A)}(v) = init_A(v) \wedge x_{n+1} = 1$, let $jump_{Ctrl(A)}(e) = jump_A(e) \wedge x_{n+1} =
1 \wedge x'_{n+1} = 0$, and let $flow_{Ctrl(A)}(v) = flow_A(v) \wedge \dot{x}_{n+1} = 1$. It follows that in the
discrete-time transition system $S_{Ctrl(A)}^{disc}$, jump transitions must alternate with flow transitions
(of duration 1). Hence the map $f : Q_{Ctrl(A)} \to Q_A \times \{control, plant\}$, defined by
$f(v, y, 0) = (v, y, plant)$ and $f(v, y, 1) = (v, y, control)$, is an isomorphism between the
two transition systems. If $A$ is m-definable with $k$ control modes, by Theorem 3.1,
the bisimilarity quotient of $S_{Ctrl(A)}^{disc}$ has no more than $k \cdot (4m + 3)^{n+1}$ equivalence
classes, which is singly exponential in the size of $A$. ■
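The alternation enforced by the extra clock can be traced with a few lines of Python. This sketch tracks only the value of the added clock $x_{n+1}$ in discrete time, under the reading that the clock starts at 1, a jump requires the value 1 and resets it to 0, and a unit flow increases it by 1 subject to the invariant $0 \le x_{n+1} \le 1$. The function names are illustrative.

```python
from typing import List

def enabled(clock: int) -> List[str]:
    """Transition kinds permitted by the extra clock of Ctrl(A)."""
    kinds = []
    if clock == 1:        # jump condition: clock equals 1 (and is reset to 0)
        kinds.append("jump")
    if clock + 1 <= 1:    # unit flow adds 1 and must keep the clock at most 1
        kinds.append("flow")
    return kinds

def step(clock: int, kind: str) -> int:
    """Apply one transition of the given kind to the clock value."""
    if kind not in enabled(clock):
        raise ValueError(f"{kind} transition is not enabled at clock value {clock}")
    return 0 if kind == "jump" else clock + 1

def role(clock: int) -> str:
    """Third component assigned by the map f of the proof."""
    return "plant" if clock == 0 else "control"
```

Starting from clock value 1 (the controller's turn), the only possible run alternates jump, flow, jump, flow, and so on.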

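Lemma 3.2 Let $G = (V_A, V_O, V_I, \to)$ be an AND-OR graph, and let $R$ be a set of vertices
of $G$. Define the transition system $S_G = (V_A \cup V_O, \Sigma, \to, V_I, \{\pi\}, \models)$ such that (1) $v \models \pi$
iff $v \in R$, (2) for all OR states $v \in V_O$, if $v \xrightarrow{\sigma} w$ and $v \xrightarrow{\sigma'} w'$, then $\sigma = \sigma'$, and (3) for all
AND states $v \in V_A$, if $v \xrightarrow{\sigma} w$ and $v \xrightarrow{\sigma'} w'$ and $w \ne w'$, then $\sigma \ne \sigma'$. Then $R$ is alternating
reachable in $G$ iff $\pi$ is not avoidable in $S_G$.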
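Alternating reachability in an AND-OR graph is a least-fixpoint computation. The Python sketch below uses the standard definition, which is assumed here rather than quoted from the paper: a vertex reaches $R$ if it lies in $R$, or it is an OR vertex with some successor that reaches $R$, or it is an AND vertex all of whose successors reach $R$. Requiring an AND vertex to have at least one successor is a further assumption of this sketch.

```python
from typing import Dict, Hashable, Iterable, Set

def alternating_reachable(and_vertices: Set[Hashable],
                          or_vertices: Set[Hashable],
                          succ: Dict[Hashable, Iterable[Hashable]],
                          targets: Set[Hashable]) -> Set[Hashable]:
    """Set of vertices from which `targets` is alternating reachable."""
    win = set(targets)
    changed = True
    while changed:                      # iterate until the fixpoint is reached
        changed = False
        for v in and_vertices | or_vertices:
            if v in win:
                continue
            nxt = list(succ.get(v, ()))
            if v in or_vertices:
                ok = any(w in win for w in nxt)
            else:                       # AND vertex: every successor must win
                ok = bool(nxt) and all(w in win for w in nxt)
            if ok:
                win.add(v)
                changed = True
    return win
```

To decide the property for a set of initial vertices, one checks whether they belong to the returned set.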

Theorem  3.4  For  bounded  timed  automata,  the  sampling-control  decision  problem  is  hard 
for  EXPTIME. 

Proof sketch. We reduce the halting problem for alternating Turing machines using polynomial
space [CKS81] to the sampling-control decision problem for bounded timed automata.
Let $M$ be an alternating Turing machine with input $s$ so that $M$ uses space $p(|s|)$. Then $M$
accepts $s$ iff the unique final state $u_F$ is alternating reachable in an AND-OR graph whose vertices
are configurations of $M$. The set of configurations of $M$ is $U \times \{1, \ldots, p(|s|)\} \times \Gamma^{p(|s|)}$,
where $U$ is the state set of $M$, the second component of the product gives the position of
the tape head, and $\Gamma$ is the tape alphabet. Without loss of generality, we assume that
$\Gamma = \{0, 1, 2\}$, where 0 is the "blank" symbol. We first define a bounded positive rectangular
automaton $A$ whose states are configurations of $M$, and a proposition $\pi_F$, large enough
for $A$, that is true exactly in the configurations containing $u_F$. This is done in a way consistent
with Lemma 3.2, so that $\pi_F$ is not avoidable in the sampling-control transition system of $A$
iff $M$ accepts $s$. Then we turn $A$ into a bounded timed automaton.

The automaton $A$ uses $p(|s|)$ variables $x_1, \ldots, x_{p(|s|)}$ to store the tape contents. The
set of control modes of $A$ is $U \times \{1, \ldots, p(|s|)\}$. The invariant and flow conditions are
constant functions: $inv(u, i) = \bigwedge_{j=1}^{p(|s|)} (0 \le x_j \le 2)$ and $flow(u, i) = \bigwedge_{j=1}^{p(|s|)} (\dot{x}_j = 0)$
for all $u$ and $i$: thus flow transitions have no effect. The initial condition is defined by
$init(u, i) = false$ except when $u$ is the initial state $u_I$ of $M$ and $i = 1$; in that case,
$init(u_I, 1) = \bigwedge_{j=1}^{|s|} (x_j = s_j) \wedge \bigwedge_{j=|s|+1}^{p(|s|)} (x_j = 0)$. Each transition $t$ of $M$ consists of a
source state $u \in U$, a tape symbol $\gamma \in \Gamma$, and a list of triples $(u_j, \gamma_j, d_j)$, where $u_j \in U$ is a
target state, $\gamma_j \in \Gamma$ is written on the current tape cell, and $d_j \in \{-1, 1\}$ gives the direction
moved by the tape head (there is exactly one transition for each source state $u$). For every
transition $t = (u, \gamma, (u_j, \gamma_j, d_j)_{j \in J})$ of $M$, every tape position $1 \le i \le p(|s|)$, and every
$j \in J$, we define in $A$ a control switch $e_{t,i,j}$ with source $(u, i)$ and target $(u_j, i + d_j)$. The
jump condition is $x_i = \gamma \wedge x'_i = \gamma_j \wedge \bigwedge_{k \ne i} (x'_k = x_k)$. If $u$ is an AND state
of $M$, then $event(e_{t,i,j}) = (u, i, j)$. If $u$ is an OR state of $M$, then $event(e_{t,i,j}) = 0$. To
turn $A$ into a timed automaton, all variables are replaced by clocks, and between any two
control switches of $A$, a sequence of $p(|s|)$ control switches is added, one for each clock, to
subtract $p(|s|) + 1$ from each clock value. ■


4  Beyond  Rectangular  Automata 

Discrete-Time Undecidability Results We show that the pleasant properties of discrete-time
rectangular automata (Theorem 3.1) depend on both conditions, (1) positivity or boundedness
and (2) rectangularity. If either condition is violated, then already the discrete-time
safety verification problem becomes undecidable.

Definition 4.1 [Triangular automaton] A triangular automaton $A$ has the same components
as a rectangular automaton, except that the predicates defining $A$ may be triangular
predicates, and need not necessarily be rectangular. ■

Theorem 4.1 The discrete-time safety verification problem (and thus the discrete-time control
decision problem) is undecidable for the class of all rectangular automata, and also for
the class of bounded positive triangular automata.

Proof  sketch.  Both  parts  use  a  reduction  from  the  halting  problem  for  two-counter  machines. 
For  the  first  part,  the  reduction  is  simple,  as  counter  values  can  be  represented  by  variable 
values,  as  in  [KPSY93].  For  the  second  part,  counter  values  must  be  encoded,  so  that  the 
counter  value  c  corresponds  to  the  variable  value  For  this  purpose,  the  wrapping-clock 
technique of [HKPV95] can be modified as follows. The set $\{x_1, \ldots, x_n\}$ of dense-time
clocks used for encoding counter values is simulated in discrete time by variables with the
triangular flow condition $\dot{x}_1 = \cdots = \dot{x}_n$. Then the variables are enforced to represent valid
encodings at those integer times when the wrapping clock shows 0. ■

Generalized Rectangular Automata It is well-known that the pleasant properties of timed
automata (Theorem 2.5) are preserved if rectangularity is relaxed to triangularity in invariant,
initial, and jump conditions. We conclude with a similar observation for rectangular
automata. A generalized rectangular automaton is a triangular automaton whose flow conditions
are rectangular predicates. It follows from our arguments that for every generalized
rectangular automaton $A$, the boolean combinations of triangular state predicates for $A$ form
a symbolic execution theory for the discrete-time transition system $S_A^{disc}$. Consequently, if
$A$ is a bounded generalized rectangular automaton, then $S_A^{disc}$ has a finite bisimilarity quotient
(which is identical to the region equivalence of timed automata [AD94], and finer by a
multiplicative exponential factor than the equivalence of Theorem 3.1). For such automata,
we can automatically synthesize sampling controllers that avoid triangular state predicates.
Abstract. We present the first fully dynamic algorithm for maintaining
a minimum spanning tree in time $o(\sqrt{n})$ per operation. To be precise, the
algorithm uses $O(n^{1/3} \log n)$ amortized time per update operation. The
algorithm is fairly simple and deterministic. An immediate consequence
is the first fully dynamic deterministic algorithm for maintaining connectivity
and bipartiteness in amortized time $O(n^{1/3} \log n)$ per update,
with $O(1)$ worst case time per query.

1  Introduction 

We consider the problem of maintaining a minimum spanning tree during an
arbitrary sequence of edge insertions and deletions. Given an n-vertex graph $G$
with edge weights, the fully dynamic minimum spanning tree problem is to maintain
a minimum spanning forest $F$ under an arbitrary sequence of the following
update operations:

insert(u,v): Add the edge $\{u, v\}$ to $G$. Add it to $F$ if this reduces the cost
of $F$, and return the edge of $F$ that has been replaced.

delete(u,v): Remove the edge $\{u, v\}$ from $G$. If $\{u, v\} \in F$, then (a) remove $\{u, v\}$
from $F$ and (b) return the minimum-cost edge $e$ of $G \setminus F$ that reconnects $F$
if $e$ exists or return null if $e$ does not exist.

In  addition,  the  data  structure  permits  the  following  type  of  query: 

connected(u,v) :  Determine  if  vertices  u  and  v  are  connected. 
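To make the interface concrete, the following Python sketch implements the three operations naively by recomputing the forest with Kruskal's algorithm after every update. It is a reference baseline for testing, not the algorithm of this paper, and it takes time roughly proportional to sorting all edges per update. The class name, the explicit weight argument `w` of `insert`, and the assumptions of distinct weights and no self-loops are choices made for this illustration.

```python
from typing import Dict, FrozenSet, Hashable, Optional, Set

Edge = FrozenSet[Hashable]

def _find(parent: dict, x):
    """Union-find lookup with path halving."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

class NaiveDynamicMSF:
    """Baseline: rebuilds the minimum spanning forest on every update."""

    def __init__(self) -> None:
        self.weight: Dict[Edge, float] = {}
        self.forest: Set[Edge] = set()

    def _rebuild(self) -> Set[Edge]:
        parent: dict = {}
        forest: Set[Edge] = set()
        for e in sorted(self.weight, key=self.weight.get):  # Kruskal order
            u, v = tuple(e)
            ru, rv = _find(parent, u), _find(parent, v)
            if ru != rv:
                parent[ru] = rv
                forest.add(e)
        return forest

    def insert(self, u, v, w: float) -> Optional[Edge]:
        """Add edge {u, v}; return the forest edge it replaced, if any."""
        self.weight[frozenset((u, v))] = w
        old, self.forest = self.forest, self._rebuild()
        return next(iter(old - self.forest), None)

    def delete(self, u, v) -> Optional[Edge]:
        """Remove edge {u, v}; return the replacement edge, if any."""
        del self.weight[frozenset((u, v))]
        old, self.forest = self.forest, self._rebuild()
        return next(iter(self.forest - old), None)

    def connected(self, u, v) -> bool:
        parent: dict = {}
        for e in self.forest:
            a, b = tuple(e)
            parent[_find(parent, a)] = _find(parent, b)
        return _find(parent, u) == _find(parent, v)
```

Such a baseline is mainly useful as an oracle against which a faster data structure can be compared on random update sequences.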

In 1985 [7], Frederickson introduced a data structure known as topology trees
for the fully dynamic minimum spanning tree problem with a worst case cost of
$O(\sqrt{m})$ per update. His data structure permitted connectivity queries to be answered
in $O(1)$ time. In 1992, Eppstein et al. [3, 4] improved the update time to
$O(\sqrt{n})$ using the sparsification technique. If only edge insertions are allowed, the
Sleator-Tarjan dynamic tree data structure [13] maintains the minimum spanning
forest in time $O(\log n)$ per insertion or query. If only edge deletions are
allowed ("deletions-only"), then no algorithm faster than the $O(\sqrt{n})$ fully dynamic
algorithm was known.

Using randomization, it was recently shown that the fully dynamic connectivity
problem, i.e., the restricted problem where all edge costs are the same, can
be  solved  in  amortized  time  0(log^  n)  per  update  and  O(logn)  per  connectivity 
query  [9,  10].  However,  this  approach  could  not  be  extended  to  arbitrary  edge 
weights,  leaving  the  question  open  as  to  whether  the  fully  dynamic  minimum 
spanning tree problem can be solved in time $o(\sqrt{n})$.

In this paper we give a positive answer to this question: We present a fully
dynamic minimum spanning tree data structure that uses $O(n^{1/3} \log n)$ amortized
time per update and $O(1)$ worst case time per query when update time is
averaged over any sequence of $\Omega(m_{in})$ updates, for $m_{in}$ the initial size of the
graph. Our technique is very different from [7].

The result is achieved in two steps: First, we give a deletions-only minimum
spanning tree algorithm that uses $O(m'^{1/3} \log n + n^{\epsilon})$ amortized time per update
and $O(1)$ worst case time per query when the update time is averaged over any
sequence of $\Omega(m_{in})$ updates. Here $\epsilon$ is any constant such that $0 < \epsilon < 1/3$, and
$m'$ is the number of nontree edges at the time of the update.

Then we present a general technique which, given a deletions-only minimum
spanning tree data structure with a certain property, generates a fully dynamic
data structure with the same running time as the deletions-only data structure.
Let $f(m', n)$ be the amortized time per deletion in the deletions-only data structure
with $m'$ nontree edges and $n$ vertices. The property required is that, upon
inserting into the graph no more than $m'$ edges at the same time (a "batch
insertion"), the deletions-only data structure can be modified to reflect these
insertions and up to $m'$ subsequent deletions can be performed in a total of
$O(m' f(m', n))$ time.

Using this technique, we develop a fully dynamic minimum spanning tree
algorithm with amortized time per update of $O(m^{1/3} \log n)$, for a sequence of
updates of length $\Omega(m_{in})$, where $m$ is the size of $G$ at the time of the update. In
other words, letting $m(i)$ denote the size of $G$ (vertices and edges) after update
$i$, the total amount of work for processing a sequence of updates of length $l$ is
$O(\sum_{i=0}^{l} m(i)^{1/3} \log n)$. We then apply sparsification [3, 4] to reduce the running
time for the sequence to $O(l \, n^{1/3} \log n)$.

Our result immediately gives faster deterministic fully dynamic algorithms
for the following problems: connectivity, bipartiteness, k-edge witness, maximal
spanning forest decomposition, and Euclidean minimum spanning tree. See [9]
for all but the last reduction; see Eppstein [2] for the last reduction. For these
problems, the new algorithm achieves an $O(n^{1/6} / \log n)$ factor improvement over
the previously best deterministic running time. If randomization is allowed, however,
much faster times are achievable [9, 10].

Additionally,  improvements  can  be  achieved  in  the  following  static  problems 
(see  [4,  3]):  randomly  sampling  spanning  forests  of  a  given  graph  [6];  finding  a 
color-constrained  minimum  spanning  tree  [8]. 

The  paper  is  structured  as  follows:  In  Section  2  we  give  a  deletions-only 
minimum  spanning  tree  algorithm.  In  Section  3,  we  show  how  to  use  a  sequence 
of deletions-only data structures to create a fully dynamic data structure.

2 Maintaining a minimum spanning tree: deletions-only

In this section, we give an algorithm which maintains a minimum spanning tree
while edges are being deleted. The amortized update time is $O(m'^{1/3}_{in} \log n)$ and
the query time is $O(1)$ for queries of the form "Are vertices $i$ and $j$ connected?".
Let $G = (V, E)$ be an undirected graph with edge weights. Without loss of
generality, we assume that edge weights are distinct.

Initially, we compute the minimum spanning forest $F$ of $G$. Let $m'_{in}$ be the
number of nontree edges in $G$ initially and $k = m'^{1/3}_{in} \log n$. We sort the nontree
edges by weight and partition them into $m'_{in}/k$ levels of size $k$ so that the $k$
lightest are in level 0, the next $k$ lightest are in level 1 and so on. The set of edges
in a level $i$ is denoted by $E_i$. In addition, all tree edges of the initial minimum
spanning forest $F$ are placed in level 0.
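The level partition is a sort followed by chunking. A minimal Python sketch follows, assuming nontree edges are given as `(u, v, weight)` triples and that the level size `k` is passed in as an integer (rounding $k$ to an integer is left to the caller); the function name is illustrative.

```python
from typing import List, Tuple

Edge = Tuple[str, str, float]  # (endpoint, endpoint, weight)

def partition_into_levels(nontree_edges: List[Edge], k: int) -> List[List[Edge]]:
    """Sort nontree edges by weight and cut them into levels of size k;
    level 0 holds the k lightest edges, level 1 the next k, and so on."""
    if k <= 0:
        raise ValueError("level size k must be positive")
    ordered = sorted(nontree_edges, key=lambda e: e[2])
    return [ordered[s:s + k] for s in range(0, len(ordered), k)]
```

The last level may hold fewer than `k` edges when the number of nontree edges is not a multiple of `k`.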

Throughout the algorithm, the level of an edge remains unchanged, and $F$
denotes the minimum spanning forest. For $i = 0, 1, \ldots, (m'_{in}/k) - 1$, let $F_i$ denote
the minimum spanning forest of the graph with vertex set $V$ and edge set $\bigcup_{j \le i} E_j$.
(Initially, all $F_i = F$, but in later stages, an edge from any level may become
a tree edge. Thus, $F_0 \subseteq F_1 \subseteq \ldots \subseteq F_{(m'_{in}/k)-1} = F$.) Let $T_i(x)$ denote the tree
in $F_i$ which contains $x$ and let $T(x)$ without the subscript denote the tree in $F$
containing $x$.

The main idea is the following. If a nontree edge is deleted, then the minimum
spanning forest $F$ is unchanged. Suppose a tree edge $\{u, v\}$ in level $i$ is deleted.
Then for each $F_j$, $j \ge i$, the deletion splits the tree in $F_j$ containing $u$ and $v$ into
$T_j(u)$ and $T_j(v)$. We search for the minimum weight nontree edge $e$ (called the
"replacement edge") that connects $T(u)$ and $T(v)$ by gathering and then testing
a set $S$ of candidate edges on level $i$. If none is found, we repeat the procedure
on level $i + 1$, etc. until one is found or all levels are exhausted. We now describe
the update operations:

delete(u,v): Delete edge $\{u, v\}$ from any data structures in which it occurs. If
a tree edge $\{u, v\}$ from level $i$ is deleted, then remove $\{u, v\}$ from $F$ and search
for a replacement by calling Replace(i, u, v). We refer to $i$ as the level of the
call to Replace.

In  the  algorithm  below,  the  subroutine  Search  when  applied  to  a  tree  in  Fi 
finds  all  nontree  edges  in  level  i  which  are  incident  to  the  tree.  A  phase  consists  of 
the  examination  of  a  single  edge.  (Its  exact  definition  and  the  details  of  Search 
are  given  in  Section  2.2  below.) 

Replace(i, u, v)

1. Alternating in lockstep, one phase at a time, run Search($T_i(u)$) and Search($T_i(v)$)
   until $k / \log n$ phases are executed (Case A) or one of the searches has stopped
   (Case B).

   - Case A: Let $S$ be the set of all nontree edges in level $i$.

   - Case B: Let $S$ be the set of (nontree) edges produced by the Search
     that stopped.

2. Test every edge in $S$ to see if it connects $T(u)$ and $T(v)$.

   - If a connecting edge is found, insert the minimum weight connecting
     edge into $F$ and the data structures representing the $F_j$, $j \ge i$.

   - Else if $i$ is not the last level, call Replace(i + 1, u, v).
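The level-by-level structure of Replace can be sketched in Python if one sets aside the lockstep searches and always takes $S$ to be the whole level (Case A). This simplification gives the same answer but not the same running time, so it illustrates the control flow only. The function name, the `(u, v, weight)` edge triples, and the representation of $T(u)$ and $T(v)$ as vertex sets are assumptions of this sketch.

```python
from typing import List, Optional, Set, Tuple

Edge = Tuple[str, str, float]  # (endpoint, endpoint, weight)

def replace(levels: List[List[Edge]], start_level: int,
            side_u: Set[str], side_v: Set[str]) -> Optional[Tuple[int, Edge]]:
    """Return (level, edge) for the lightest nontree edge joining the two
    sides, scanning levels from start_level upward; None if there is none."""
    for i in range(start_level, len(levels)):
        connecting = [
            (a, b, w) for (a, b, w) in levels[i]
            if (a in side_u and b in side_v) or (a in side_v and b in side_u)
        ]
        if connecting:
            # Levels are ordered by weight, so the first level that contains
            # a connecting edge also contains the lightest one.
            return i, min(connecting, key=lambda e: e[2])
    return None
```

In the full algorithm the scan of a level is replaced by the bounded Search calls, which is where the savings come from.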


2.1  Data  Structures 

The idea here is to use the ET-tree data structure developed in [9]: (1) to represent
and update each tree in $F$, so that in constant time, we can quickly test
if a given edge joins two trees; and (2) to represent each tree in an $F_i$ in such
a way that we can quickly retrieve nontree edges in $E_i$ which are incident to
the tree. To avoid excessive cost, we explicitly maintain only those $F_i$ where $i$
is a multiple of $m'^{1/3}_{in} / \log n$. An unpleasant consequence of this is that when
retrieving nontree edges in $E_i$, other nontree edges are also retrieved.

Below, we refer to input graph vertices as "vertices" and use "node" to mean
a node of the B-tree in which we store the "ET-sequences."

ET-trees: An ET-sequence is a sequence generated from a tree by listing each
vertex each time it is encountered ("an occurrence of the vertex") as a tree is
searched depth-first. Each ET-sequence is stored in a B-tree of degree $d$. This
allows us to implement the deletion or insertion of an edge in the forest as follows:
we split a tree by deleting an edge or join two trees by inserting an edge in time
$O(d \log_d n)$, using a constant number of splits and joins on the corresponding
B-trees. Also we can test two vertices of the forest to determine whether they
are in the same tree in time $O(\log_d n)$. See for example [1, 11] for operations on
B-trees. If $d = n^{\alpha}$, for $\alpha$ a positive constant, then the join and split operations
take time $O(d)$ and the test operation takes time $O(1)$. We refer to the B-trees
used to store ET-sequences as ET-trees.
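An ET-sequence is simply the list of vertices visited by a depth-first traversal, with a vertex recorded again each time the traversal returns to it. The Python sketch below builds such a sequence for a tree given as an adjacency mapping; it uses a plain list rather than a B-tree, and the function name is illustrative. It assumes a simple tree (no parallel edges).

```python
from typing import Dict, Hashable, List

def et_sequence(adj: Dict[Hashable, List[Hashable]], root: Hashable) -> List[Hashable]:
    """List every vertex occurrence met by a depth-first walk from root."""
    seq = [root]
    stack = [(root, None, iter(adj[root]))]
    while stack:
        u, parent, neighbours = stack[-1]
        for w in neighbours:
            if w != parent:            # descend along an unexplored tree edge
                seq.append(w)
                stack.append((w, u, iter(adj[w])))
                break
        else:                          # all neighbours done: return to the parent
            stack.pop()
            if stack:
                seq.append(stack[-1][0])
    return seq
```

For a tree with $n$ vertices the sequence has $2n - 1$ entries, since each of the $n - 1$ edges is traversed once in each direction.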

This  data  structure  allows  us  to  keep  information  about  a  vertex  so  that 
the  cumulative  information  about  all  vertices  in  a  tree  may  be  maintained.  For 
example,  we  may  keep  the  number  of  nontree  edges  incident  to  a  vertex  at 
one  designated  occurrence  of  the  vertex.  Then  each  internal  node  of  the  ET-tree 
stores  the  sum  of  the  numbers  of  nontree  edges  kept  with  designated  occurrences 
in its subtree. In a degree $d$ ET-tree, each split or join operation or each change to
the number associated with an occurrence requires the adjustment of $O(\log_d n)$
internal nodes with each adjustment taking $O(d)$ timesteps.

We  maintain  the  following  data  structures. 

- Each edge is labelled by its level and a bit which indicates if it is a tree edge.

- Let $k' = \max\{m'^{1/3}_{in} \log n, n^{\epsilon}\}$, for any constant $0 < \epsilon < 1/3$. Each tree in $F$
  is represented as an ET-sequence which is stored in a degree $k'$ B-tree.

- Let $c = m'^{1/3}_{in} / \log n$. We map each level $i$ to the $j$ which is the largest
  multiple of $c$ no greater than $i$ by the function $f(i) = c \lfloor i/c \rfloor$.

  For each level $j$ such that $c \mid j$ ("c divides j"):

  • we represent each tree in $F_j$ as an ET-sequence which is stored in a
    binary B-tree;

  • for each vertex $v$, we create a list $L_j(v)$ which contains:

    (i) all nontree edges incident to $v$ which are in any level $i \in f^{-1}(j)$, and

    (ii) all tree edges incident to $v$ which are in any level $i > j$, $i \in f^{-1}(j)$.

  • We mark each designated occurrence of a vertex $v$ whose list $L_j(v)$ is
    nonempty. Each internal node of the ET-tree is marked if its subtree
    contains a marked occurrence.
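The level map $f$ and the lists $L_j(v)$ can be sketched as follows. The sketch assumes that $c$ is a positive integer, that $f^{-1}(j)$ denotes the set of levels $i$ with $f(i) = j$, and that edges are given as `(u, v, level, is_tree)` tuples; the names `f` and `build_lists` are illustrative.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

EdgeRec = Tuple[Hashable, Hashable, int, bool]  # (u, v, level, is_tree)

def f(i: int, c: int) -> int:
    """Largest multiple of c that is no greater than i."""
    return c * (i // c)

def build_lists(edges: List[EdgeRec], c: int) -> Dict[Tuple[int, Hashable], List[EdgeRec]]:
    """Map (j, vertex) to the list L_j(vertex) for every maintained level j."""
    lists: Dict[Tuple[int, Hashable], List[EdgeRec]] = defaultdict(list)
    for rec in edges:
        u, v, level, is_tree = rec
        j = f(level, c)
        if is_tree and level == j:
            continue                  # tree edges are listed only for levels above j
        lists[(j, u)].append(rec)
        lists[(j, v)].append(rec)
    return lists
```

A vertex whose list for level `j` is nonempty is exactly one whose designated occurrence would be marked in the ET-tree for $F_j$.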


2.2  The  Search  routine 

Search($T_i(x)$) returns all nontree edges in level $i$ incident to $T_i(x)$. It begins by
searching $T_{f(i)}(x)$ which is a subtree of $T_i(x)$. It proceeds by examining all edges
in $L_{f(i)}(v)$ for all vertices $v$ in the tree being searched. Nontree edges in level $i$
are picked out and tree edges in levels $i'$ with $f(i) < i' \le i$ are followed to other trees
of $F_{f(i)}$ which are then searched in turn. Note that all such tree edges lead to
other trees of $F_{f(i)}$ which are subtrees of $T_i(x)$. A phase of the algorithm consists
of the examination of one edge $e$ in a list $L$.

Search($T_i(x)$)

1. $S \leftarrow \emptyset$;

2. treelist $\leftarrow \{T_{f(i)}(x)\}$;

3. Repeat until treelist is empty:

   - Remove an ET-tree from the treelist.

   - For each marked vertex $u$ in the ET-tree and for each edge $\{u, v\}$ in $L_{f(i)}(u)$:

     • If $\{u, v\}$ is a nontree edge on level $i$, add it to the set $S$ of edges to
       return.

     • Else if $\{u, v\}$ is a tree edge on level $l$ such that $l \le i$, then add $T_{f(i)}(v)$
       to treelist.
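The Search routine can be sketched in Python with plain dictionaries standing in for the ET-trees. In this sketch `tree_of` maps a vertex to an identifier of its tree in $F_{f(i)}$, `members` maps an identifier to the vertices of that tree, and `lists` maps a vertex to its list $L_{f(i)}$ of `(u, v, level, is_tree)` records. These names, and the `visited` set used to avoid searching a tree twice, are assumptions of the illustration rather than details taken from the paper.

```python
from typing import Dict, Hashable, List, Set, Tuple

EdgeRec = Tuple[Hashable, Hashable, int, bool]  # (u, v, level, is_tree)

def search(x: Hashable, i: int,
           tree_of: Dict[Hashable, Hashable],
           members: Dict[Hashable, List[Hashable]],
           lists: Dict[Hashable, List[EdgeRec]]) -> Set[EdgeRec]:
    """Collect the nontree edges on level i incident to the tree T_i(x)."""
    found: Set[EdgeRec] = set()
    visited: Set[Hashable] = set()
    treelist = [tree_of[x]]
    while treelist:
        t = treelist.pop()
        if t in visited:
            continue
        visited.add(t)
        for u in members[t]:
            for rec in lists.get(u, ()):
                a, b, level, is_tree = rec
                other = b if a == u else a
                if not is_tree:
                    if level == i:
                        found.add(rec)            # candidate replacement edge
                elif level <= i:
                    treelist.append(tree_of[other])  # follow tree edge onward
    return found
```

Each inspection of one record in the inner loop corresponds to one phase in the sense of the text.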


2.3  Analysis 

Initialization: We compute the minimum spanning forest $F$, create the ET-trees
for $F_j$, for each $j$ such that $c \mid j$, and partition the nontree edges by weight.
Recall that $m'_{in}$ is the number of nontree edges in the initial graph. Let $t$ be the
number of edges in the initial minimum spanning forest. The creation of all the
lists $L$ takes time proportional to the number of nontree edges $m'_{in}$. The building
of ET-trees for $F$ and all $F_j$ such that $c \mid j$ and the marking of internal nodes
takes time proportional to the size of each forest or $O(((m'_{in}/k)/c)\, t + m'_{in}) =
O(m'^{1/3}_{in} t + m'_{in})$.

Deletions of nontree edges: Deleting a nontree edge on any level may require
resetting the bit of an occurrence of a vertex in some ET-tree, which may require
resetting bits on all internal nodes on the path to the root in $O(\log n)$ time.

Deletions and insertions of tree edges: Deleting a tree edge takes $O(k')$ time to
delete it from the ET-tree of $F$ and $O(\log n)$ time to delete it from the
ET-tree of each $F_j$ such that $c \mid j$, for a total of
$O(k' + ((m'_{in}/k)/c) \log n)$ time per edge. Inserting a replacement edge takes
the same time.

Finding a replacement edge: We first analyze the cost of Search. Let the weight
$w(T)$ of a tree $T$ of some $F_i$ be $|L_{f(i)}(v)|$ summed over all vertices $v$
in $T$. It costs $O(\log n)$ to move down the path from the root to a leaf in an
ET-tree to find a marked occurrence of a vertex, or to move up a tree from an
occurrence to the root. Thus, the cost of Search($T_i(x)$) is $O(\log n)$ times
the number of edges examined, or $O(w(T_i(x)) \log n)$, if Search is carried out
until it ends, and $O(k)$ if it is run for $k/\log n$ phases.

In Replace($u, v, i$), if $w(T_i(u)) < w(T_i(v))$, then we refer to $T_i(u)$ as
the smaller component $T_1$; otherwise $T_1$ is $T_i(v)$. The cost of a call to
Replace($u, v, i$) is the cost of the Search plus the cost of testing each edge in
$S$. The number of edges in $S$ is $O(\min\{k, w(T_1)\})$. We may use the
$k'$-degree ET-tree representation for $F$ to test each edge at cost $O(1)$. Thus
the cost of a call to Replace is $O(\min\{k, w(T_1) \log n\})$.

To pay for these costs: We charge the cost of a call to Replace($u, v, i$) to
level $i$ if no replacement edge is found on that level. In that case, a tree of
$F_i$ which was split by the deletion remains split. Otherwise, we charge the cost
to the deletion. In addition, we charge the cost of modifying $F$ to the deletion
so the total cost charged to the deletion is
$O(\min\{k, w(T_1) \log n\} + ((m'_{in}/k)/c) \log n + k') = O(((m'_{in}/k)/c) \log n + k')$.

Claim 1 The sum of $w(T_1)$ over all smaller components $T_1$ which split from
a tree $T$ on any given level during all Replace operations is $O(w(T) \log n)$.

The  proof  of  the  claim  is  not  hard  and  follows  [5].  The  details  are  omitted 
here. 

There are at most $k$ edges per level (except for level 0, which has at most
$k$ nontree edges). Each $L_j(v)$ consists of edges from $c$ levels. Since level 0
tree edges do not belong to any list $L_j(v)$, the maximum weight of a tree $w(T)$
is $ck$. Thus the total cost charged to a level is $O(ck \log^2 n)$. Summing over
all levels we have $O((m'_{in}/k)(ck \log^2 n)) = O(m'_{in}\, c \log^2 n)$, or an
amortized cost per deletion of $O(c \log^2 n) = O({m'_{in}}^{1/3} \log n)$, if
$\Omega(m'_{in})$ edges are deleted.

The  cost  charged  to  each  deletion  is  0((m'in/<^k)(logn)  -h  k').  For  k'  = 
max{m'-/^logn,n^}  and  c  =  log this  is  logn -t  n^). 

To  summarize  the  cost  of  initialization  when  amortized  over  f7(min)  operations 
is  and  the  cost  per  deletion  of  an  edge  and  finding  replacement  edges, 

when amortized over $\Omega(m'_{in})$ operations, is
$O({m'_{in}}^{1/3} \log n + n^{\epsilon})$. Thus for a sequence of
$\Omega(m'_{in})$ operations, the amortized time per update is
$O({m'_{in}}^{1/3} \log n + n^{\epsilon})$.

Finally, we note that the query of the form “Are nodes $i$ and $j$ connected?”
may be answered using the ET-tree data structure for $F$ in $O(1)$ time.


3 Allowing insertions

As in the previous section, we assume all edge weights are unique. We refer to
the current minimum spanning forest of $G$ as the MST. Let $m'$ be the number
of nontree edges in the current graph.

Let $s = \lceil \lg m' \rceil$. Initially, we build and maintain $s$ simultaneous
deletions-only data structures $A_1, \ldots, A_s$ and a set of edges $B$. We call
this the composite data structure. We maintain the MST in a Sleator-Tarjan dynamic
tree [13] and also in an ET-tree of degree $\max\{m'^{1/3} \log n, n^{\epsilon}\}$.

Below, we distinguish between the number of edges inserted into $G$ and the
number of edge insertions into $B$, as an edge of $G$ may be inserted more than
once into $B$ even though it has not been deleted and reinserted into $G$. The
minimum spanning forests of the deletions-only data structures are referred to
as local spanning forests. A local nontree edge of an $A_i$ is an edge which is
not in $A_i$'s local spanning forest or the MST. We will see that every nontree
edge of $G$ is a local nontree edge of some $A_i$ or $B$, but may also be a tree
edge in a local spanning forest of an $A_j$, $j \ne i$.

Initially $A_s$ is the deletions-only data structure described in the previous
section, with $F$ = MST and the set of local nontree edges being all nontree
edges of $G$, and the parameter set to $2^s$. The set $B$ is empty and the
remaining $A_j$, $1 \le j < s$, are initialized (“built”) as though the edges of
the MST were the only edges in $A_j$, i.e., they contain no nontree edges.

For $i = 1, \ldots, s$, let $m_i = 2^i$, $k_i = m_i^{1/3} \log m_i$, and
$k'_i = \max\{m_i^{1/3} \log n, n^{\epsilon}\}$. When an edge is inserted into
$G$, it is placed into $B$ or into the MST.

Let $x_i$ be the number of local nontree edges in $\bigcup_{j \le i} A_j \cup B$.
Each $A_i$ is built (or rebuilt) when $i$ is the smallest index such that
$m_i > x_i$ and the number
of  edges  in  B  has  increased  to  At  that  time,  B  is  emptied  and  all  local 

nontree edges in $\bigcup_{j<i} A_j$ and edges in $B$ are removed from $A_j$,
$j < i$, and $B$ and placed into $A_i$. Then $A_i$ becomes the deletions-only data
structure described in the previous section, which is initialized (or
reinitialized) to contain the tree edges of the MST and the local nontree edges
previously contained in $\bigcup_{j<i} A_j$
and  the  edges  B.  Thus,  throughout  the  algorithm,  B  contains  fewer  than 
edges,  i.e.,  the  most  recent  insertions  into  B,  which  have  not  yet  been  added  to 
some $A_j$, and for $j < \lfloor \lg m'^{1/3} \rfloor$, $A_j$ never contains any
nontree edges. These $A_j$ are maintained in the event that they will be used
later, if $m'$ is reduced.

To insert edge $e$ into $G$: Use the dynamic tree to find the maximum weight edge
$f$ on the path between $e$'s endpoints in the MST. If $e$ is lighter than $f$,
remove $f$ from the MST, add $e$ to the MST in its place, and insert $f$ into $B$.
Else insert $e$ into $B$.
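The insertion rule can be illustrated with a small Python sketch. It is a simplified stand-in, not the data structure of the paper: a depth-first search over an adjacency dictionary replaces the Sleator-Tarjan dynamic tree, so each query costs linear rather than logarithmic time, and edge weights are assumed unique. The names `heaviest_on_path` and `insert_edge` are hypothetical, and the case of endpoints in different trees (where the new edge simply joins the MST) is an assumption of the sketch.

```python
def heaviest_on_path(adj, u, v):
    """Return (weight, a, b) of the heaviest edge on the tree path u..v, or None."""
    stack = [(u, None, None)]           # (vertex, parent, heaviest edge so far)
    while stack:
        node, parent, best = stack.pop()
        if node == v:
            return best
        for nxt, w in adj.get(node, {}).items():
            if nxt != parent:
                cand = (w, node, nxt)
                stack.append((nxt, node, cand if best is None or w > best[0] else best))
    return None                         # u and v lie in different trees

def insert_edge(adj, B, u, v, w):
    """Place edge (u, v, w) into the MST or into B, following the rule above."""
    f = heaviest_on_path(adj, u, v)
    if f is not None and not w < f[0]:
        B.append((u, v, w))             # e is not lighter than f: e goes to B
        return
    if f is not None:                   # e is lighter than f: f leaves the MST
        fw, a, b = f
        del adj[a][b]
        del adj[b][a]
        B.append((a, b, fw))
    adj.setdefault(u, {})[v] = w        # e enters the MST
    adj.setdefault(v, {})[u] = w

# Hypothetical example: the MST is the path 1-2-3-4 with weights 5, 9, 2.
MST = {1: {2: 5}, 2: {1: 5, 3: 9}, 3: {2: 9, 4: 2}, 4: {3: 2}}
B = []
insert_edge(MST, B, 1, 3, 7)            # lighter than edge (2, 3): swap
insert_edge(MST, B, 2, 4, 8)            # heavier than every path edge: goes to B
insert_edge(MST, B, 4, 5, 1)            # joins a new vertex: goes to the MST
```

After these three calls the edge (2, 3) of weight 9 and the edge (2, 4) of weight 8 sit in `B`, while the MST contains the edges of weights 5, 7, 2 and 1.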

To delete an edge $e$ from $G$: (1) Delete $e$ from all data structures in which
it appears. (2) For each $A_i$ which contained $e$ in its local spanning forest,
update the $A_i$ by determining $e$'s local replacement edge $e'$ (if there is
one). Insert $e'$ into $A_i$'s local forest and into $B$ if it is not already
there. (3) If $e$ was in the MST, then for each local replacement edge $e'$ and
each edge in $B$, use the ET-tree representation of the MST to determine which of
those edges connect the two subtrees resulting from the deletion of $e$. Insert
the lightest connecting edge into the MST.
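Step (3) amounts to filtering a short candidate list and taking a minimum. The following Python sketch shows that selection under simplifying assumptions: the forest is an adjacency dictionary from which $e$ has already been removed, and a plain graph traversal stands in for the constant-time ET-tree membership test. The function names and the example data are hypothetical.

```python
def component(adj, start):
    """Vertices reachable from start in the forest adj."""
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for y in adj.get(x, {}):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return seen

def lightest_connecting_edge(adj, deleted, candidates):
    """Among candidate edges (u, v, w), pick the lightest one that reconnects
    the two subtrees left by removing the tree edge `deleted`."""
    a, b = deleted
    side_a, side_b = component(adj, a), component(adj, b)
    crossing = [
        (u, v, w) for u, v, w in candidates
        if (u in side_a and v in side_b) or (u in side_b and v in side_a)
    ]
    return min(crossing, key=lambda e: e[2]) if crossing else None

# Hypothetical example: tree edge (2, 3) was just deleted from the path 1-2-3-4-5.
FOREST = {1: {2: 1}, 2: {1: 1}, 3: {4: 2}, 4: {3: 2, 5: 3}, 5: {4: 3}}
CANDIDATES = [(1, 4, 8), (2, 5, 6), (3, 5, 4)]   # local replacements and edges of B
```

Here the edge (3, 5) is the lightest candidate but lies inside one subtree, so the edge (2, 5) of weight 6 is selected.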

3.1  Proof  of  correctness 

Our  algorithm  maintains  the  following  invariant: 

Invariant: Every edge in the local forest of some $A_i$ is (1) in the MST, or (2)
is a local nontree edge in some $A_j$, or (3) is in $B$.

Lemma 2. The invariant stated above holds throughout the execution of the
algorithm.

The  proof  of  the  lemma  is  straightforward  and  is  omitted  here. 

The  correctness  of  the  algorithm  follows  easily  from  the  invariant.  We  use 
the  well-known  fact  that  an  edge  is  in  the  minimum  spanning  tree  iff  it  is  not 
the  heaviest  edge  in  any  cycle  (“red  rule”  [14]).  We  also  note  that  every  edge  in 
the  composite  data  structure  is  an  edge  in  G. 
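The red rule can be checked on a small example. The Python sketch below is an illustration only and assumes unique edge weights: it computes the minimum spanning forest with Kruskal's algorithm and compares it with the set of edges that are not the heaviest edge of any cycle. An edge is the heaviest edge of some cycle exactly when its endpoints are already connected by strictly lighter edges, which is what `heaviest_on_some_cycle` tests. All names and the example graph are hypothetical.

```python
def kruskal(n, edges):
    """Minimum spanning forest of edges (w, u, v) on vertices 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    forest = set()
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.add((w, u, v))
    return forest

def heaviest_on_some_cycle(n, edges, edge):
    """True iff the endpoints of `edge` are connected by strictly lighter edges."""
    w, u, v = edge
    adj = {i: [] for i in range(n)}
    for w2, a, b in edges:
        if w2 < w:
            adj[a].append(b)
            adj[b].append(a)
    seen, stack = {u}, [u]
    while stack:
        x = stack.pop()
        if x == v:
            return True
        for y in adj[x]:
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False

# Hypothetical example graph on 4 vertices, edges given as (weight, u, v).
EDGES = [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3), (5, 1, 3)]
RED_RULE_FOREST = {e for e in EDGES if not heaviest_on_some_cycle(4, EDGES, e)}
```

On this graph both computations give the edges of weights 1, 2 and 4.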

Let $e$ be an edge of the MST which is deleted. Let $e'$ be the correct
replacement edge. Consider the state of the composite data structure right before
the deletion of $e$. By the invariant, since $e'$ was not in the MST, it was a
local nontree edge in some $A_i$ or in $B$. If $e'$ was in $B$ it would be checked
in Step 3 above.

If $e'$ was a local nontree edge in $A_i$, then consider the subgraph $G'$ of $G$
whose edge set consists of edges in $A_i$. Since $e'$ is the correct replacement
edge for $e$ in the MST, after $e$'s deletion $e'$ is not the heaviest edge in any
cycle of $G$ and therefore is not the heaviest edge of any cycle of $G'$. Hence,
after $e$'s deletion, $e'$ becomes a local forest edge, i.e., $e'$ is a local
replacement edge for $e$ in $A_i$. Recall that $e'$ is the minimum weight edge
which connects the two subtrees of the MST resulting from the deletion of $e$.
Thus, $e'$ is the lightest connecting edge from among the edges of $B$ and the set
of local replacement edges, and is chosen in Step 3 by the algorithm.

3.2  Implementation  and  analysis 

At the start of the algorithm, $A_i$ for $i \le s$ are built. After that, $A_i$
for $i \le s$ may be “rebuilt”. Depending on the value of $m'$,
$A_{s+1}, A_{s+2}, \ldots$ may be built later. We first consider the (one-time)
cost of building the $A_i$'s, then the cost of their rebuilding, and finally the
cost of maintaining the $A_i$ between rebuilds.

Initialization of the $A_i$: Let $m_{in}$ be the size of the initial graph (number
of vertices plus edges), let $m'_{in}$ be the initial number of nontree edges, and
for each operation let $m$ be the size of the current graph. Recall that the total
cost of initialization for a deletions-only data structure with $m_i$ nontree
edges is $O(m_i^{1/3} n + m_i)$ and that we are given a sequence of
$\Omega(m_{in})$ operations.

We will amortize the building of the first $\lceil \lg m_{in} \rceil$ $A_i$'s to
the sequence of $\Omega(m_{in})$ operations, even though only $s$ $A_i$'s are
built initially. If more than $\lceil \lg m_{in} \rceil$ $A_i$'s are necessary at
some point, we know that at this point $m >$
m^n  and  at  least  min  insertions  happened.  Let  be  the  largest  deletions- 

only  data  structure  built  during  the  execution  of  the  algorithm  where  > 

niin.  Then  there  was  a  sequence  of  operations  during  which  m  was 

and  we  can  amortize  the  initialization  cost  over  these  operations. 

The  total  cost  of  initializing  the  Smax  is  *  2^/^n+2*)  =  0(2^”"“*/^n+ 

rriin).  The  average  cost  over  a  sequence  of  12(2^'"“=' )  operations  is  thus  ^^n), 

which  is  0{Tn}f^)  per  operation. 

Rebuilding: We create ET-trees for the new $A_i$ by modifying the ET-trees for
the previous $A_i$. For each $i$, we keep a list of all changes made to each
ET-tree of $A_i$ since the last rebuild, and a list of all changes made to the
MST. We use this list to first restore all the ET-trees for $A_i$ to their
previous state when $A_i$ was last built or rebuilt, $MST_{old}$, by undoing each
change, edge by edge. We then transform each $MST_{old}$ to MST, edge by edge.

The cost of restoring the ET-trees of $A_i$ to $MST_{old}$ is charged to
operations on the deletions-only data structure $A_i$ which caused the initial
change. This results in only a doubling of cost per operation, as the cost for
inserting a tree edge into an ET-tree is the same as for deleting a tree edge.

The cost of transforming $MST_{old}$ to MST is charged to the update operation
that causes the change in MST (each update causes $O(1)$ changes in MST) as
follows: For each $A_i$, there are forests of ET-trees represented by binary
B-trees and one forest (the ET-tree for the local spanning forest $F$)
represented by a degree-$k'_i$ B-tree. Thus, a single tree edge insertion or
deletion costs $O(m_i^{1/3} \log n + k'_i)$ for $A_i$. Note that for each $i$ one
change to the MST contributes to the cost of only one rebuild of $A_i$. The total
cost per change over
all  levels  is  logn  +  k[+  logn)  =  logn  -I-  logn). 

Also, when $A_i$ is rebuilt, all local nontree edges from $A_j$, $j < i$, and $B$
are moved from $A_j$ and inserted into $A_i$. That is, the edges are sorted by
weight, assigned to levels in $A_i$, and put in the appropriate list $L$. The bits
on the internal nodes of ET-trees for $A_j$, $j < i$, are set appropriately. Since
each local nontree edge is stored in only one ET-tree on a level, the cost of
moving a single local nontree edge is $O(\log n)$. Thus, the total cost is
$O(m_i \log n)$. Since $A_{i-1}$ is not rebuilt, $x_{i-1} \ge m_{i-1} = m_i/2$.
We amortize this cost by charging $O(\log n)$ to each edge in
$\bigcup_{j<i} A_j \cup B$, i.e. each edge that is newly added to $A_i$. We show
below (type (3) charges) how to amortize these costs over the update operations.

Maintaining the deletions-only data structures: After a rebuild in $A_i$ there are
at most $m_i$ nontree edges in $A_i$. In Section 2, we have two types of charges:
(1) the cost charged to each deletion in a deletions-only data structure, which is
$O(m_i^{1/3} \log n + n^{\epsilon})$, and (2) the cost charged to all the levels,
which is $O(m_i^{1/3} \log n)$ per nontree edge. Additionally, the rebuilding of
an $A_i$ above charged $O(\log n)$ to each nontree edge in $A_i$. We call these
costs type (3) charges.

Type (1) charges: When there is a deletion in $G$ in the fully dynamic data
structure, an edge (or one of its copies) may be deleted from each of $A_i$,
$i = 1, \ldots, s$. We may charge that deletion in $G$ with the (1) charges for
all levels for a total cost of
$O(\sum_i (m_i^{1/3} \log n + n^{\epsilon})) = O(m'^{1/3} \log n + n^{\epsilon} \log n)$.

Type (2) and (3) charges: As a special case, the charges incurred by the first
deletions-only data structure containing nontree edges, $A_s$, must be amortized
over the initial sequence of $\Omega(m_{in})$ operations which follow its
initialization. Since each $A_i$, $i < s$, is initialized to contain no nontree
edges, there are no type (2) and (3) charges for these data structures until they
are rebuilt.

Note that the $A_i$, $i > s$, contain nontree edges when initialized. The type (2)
and type (3) charges for their building and rebuilding and the rebuilding of the
other $A_i$, $i \le s$, are amortized over the insertions which occurred previous
to its building or rebuilding, as analyzed below.

Suppose $A_i$ is rebuilt. Since $A_{i-1}$ was not rebuilt, $x_{i-1} \ge m_{i-1} = m_i/2$
at  the  time  of  the  rebuilding  of  A^.  Thus,  insertions  into  B  occurred 

since  the  previous  rebuild  of  and  Q{mi)  of  these  occurred  when  the  graph 
had  Q{mi)  nontree  edges.  Thus,  we  may  charge  each  insertion  into  B  with 
mj/^logn  +  =  0(771^^  logn  +  nMogn)  where  s'  =  2^‘srr^'l  ^here 

m'  is  the  number  of  nontree  edges  in  G  when  the  insertion  occurred. 

To amortize costs over insertions into $G$, rather than $B$, we use the following
simple but crucial observation: When an edge is inserted into $B$, that edge may
contribute to the type (2) and (3) costs for $A_i$ (when it belongs to $A_i$) iff
it increases $x_i$. Note that $m_i > x_i \ge x_{i-1} \ge m_{i-1} = m_i/2$. We
charge $O(m_i^{1/3} \log n)$ to each local nontree edge inserted into $A_{i+1}$ to
pay for the type (2) and (3) charges while the edges are in $A_i$.

We examine the types of insertions into $B$ to see how they affect $x_i$: (a)
when an edge is first inserted into $B$, i.e., when the edge is inserted into $G$;
(b) when an edge is replaced in the MST; (c) when an edge is deleted in $G$ and
it is replaced in up to $s$ local spanning forests. The first two cases result in
a single insertion into $B$. The third case may cause up to $s'$ insertions.
However, the $s$ insertions do not affect all $A_i$ the same. Each insertion in
this case results from a local nontree edge $e$ becoming a local forest edge.
Hence if this occurs in some $A_j$, $j \le i$, the increase of $x_i$ resulting
from the insertion of a copy of $e$ into $B$ is offset by the decrease of $x_i$
caused by the change in status of $e$ from a local nontree edge to a local tree
edge. Thus $x_{s'}$ is unchanged by a case-(c) insertion into $B$, $x_{s'-1}$ is
changed by at most 1, and in general, $x_i$ is changed by
at  most  s’-i.  The  type  (2)  and  (3)  cost  per  deletion  is  = 

O(Ei("i.'/20^/®logn)  =  OK'/^logn)  . 

Thus the deletion cost per update operation is
$O(m'^{1/3} \log n + n^{\epsilon} \log n)$.

Insertion cost: Testing a newly inserted edge to see if it should be added to
the MST using the Sleator-Tarjan dynamic trees is an $O(\log n)$ cost operation.
Adding an edge to $B$ can be done in constant time, as $B$ is an unsorted list.

Summary: For rebuilding and maintaining the deletions-only data structures, the
algorithm achieves an amortized cost of $O(m'^{1/3} \log n + n^{\epsilon})$ per
update, where $m'$ is the number of nontree edges in the graph, for processing a
sequence of $\Omega(m_{in})$ operations, where $m_{in}$ is the initial size of the
graph (vertices plus edges). For the initializations of the deletions-only data
structures, the amortized cost per
update  is  log n),  where  m  is  the  size  of  the  graph  at  the  time  of  the 

update, for a sequence of $\Omega(m_{in})$ operations.

Note:  For  unweighted  graphs,  a  simpler  fully  dynamic  data  structure  can  be 
constructed  which  uses  only  one  deletions-only  data  structure  and  adds  levels  as 
needed. The details are omitted here.


Efficient Splitting and Merging Algorithms for
Order Decomposable Problems

(Extended  Abstract) 


Roberto  Grossi  *  and  Giuseppe  F.  Italiano  ** 


Abstract. We present a general and novel technique for solving decomposable
problems on a set $S$ whose items are sorted with respect to $d > 1$ total
orders. We show how to dynamically maintain $S$ in the following time bounds:
$O(\log p)$ for the insertion or the deletion of a single item, where $p$ is the
number of items currently in $S$; $O(p^{1-1/d})$ for splits and concatenates
along any total order; $O(p^{1-1/d})$ plus an output sensitive cost for
rectangular range queries. The space required is $O(p)$. We provide several
applications of our technique ranging from two-dimensional priority queues and
$d$-dimensional search trees to concatenable interval trees. This allows us to
improve many previously known results on decomposable problems under split and
concatenate operations, such as membership query, minimum-weight item, range
query, and convex hulls. Our technique is suitable for efficient external memory
implementation.


1  Introduction 

Let $\mathcal{P}$ be a searching problem defined on an input set $S$ with $p$
items, and let $\mathcal{P}(x, S)$ denote its solution for a query item $x$.
Problem $\mathcal{P}$ is decomposable if we can find an answer to query
$\mathcal{P}(x, S)$ by first partitioning set $S = S' \cup S''$ and computing the
answers to queries $\mathcal{P}(x, S')$ and $\mathcal{P}(x, S'')$ recursively, and
then combining them through a suitable operator $\diamond$. Formally,
$\mathcal{P}$ is said to be $f(p)$-decomposable if and only if
$\mathcal{P}(x, S) = \diamond(\mathcal{P}(x, S'), \mathcal{P}(x, S''))$ for any
partition $S = S' \cup S''$ and any query item $x$, where $\diamond$ is an
operator whose computation requires $O(f(p))$ time. (We assume that function
$f(p)$ is smooth, i.e., $f(O(p)) = O(f(p))$, and nondecreasing.) Some examples of
$O(1)$-decomposable searching problems include: membership queries (with
$\diamond$ being the logical-or function); closest point queries (with $\diamond$
the minimal distance); range queries (with $\diamond$ the list append operation).
Convex hull searching is not decomposable as the fact that a point $x \in S$
belongs to the convex hull of $S'$ or $S''$ does not necessarily imply that $x$
belongs to the convex hull of $S = S' \cup S''$. The definition of decomposable
search problems can be extended also to the decomposable set problems in which
the query item is not specified (e.g., finding the minimum-weight item, where
$\diamond$ is the minimum), and we shall denote a generic solution to a
decomposable problem $\mathcal{P}$ by $\mathcal{P}(S)$. Let $d \ge 1$ total orders
$\prec_1, \ldots, \prec_d$ be defined on $S$, and let $\prec_i$ be a given total
order, $1 \le i \le d$. A problem $\mathcal{P}$ is $f(p)$-order decomposable with
respect to total order $\prec_i$ if
$\mathcal{P}(S) = \diamond(\mathcal{P}(S'), \mathcal{P}(S''))$ for any ordered
partition $S = S' \cup S''$ (i.e., $x' \prec_i x''$ for all $x' \in S'$ and
$x'' \in S''$), where operator $\diamond$ takes $O(f(p))$ time. Problem
$\mathcal{P}$ is $f(p)$-order decomposable if it is $f(p)$-order decomposable with
respect to any total order $\prec_i$, $1 \le i \le d$. Convex hull searching is
$O(\log p)$-order decomposable. Other examples of order decomposable problems
include multidimensional range queries and Voronoi diagrams, and many other
decomposable problems in basic data structures, computational geometry, database
applications and statistics [7, 17, 21].
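The definition of a decomposable problem can be made concrete with a short Python sketch. It is illustrative only: `solve` splits a non-empty list at an arbitrary point, solves both halves recursively, and merges the two answers with the combining operator, which is exactly the role of $\diamond$ above. The three wrappers correspond to the examples in the text (membership with logical-or, closest point with minimum distance, minimum-weight item with minimum). All function names are hypothetical.

```python
def solve(items, base, combine):
    """Answer a decomposable problem on a non-empty list by divide and combine."""
    if len(items) == 1:
        return base(items[0])
    mid = len(items) // 2               # any partition S = S' u S'' would do
    return combine(solve(items[:mid], base, combine),
                   solve(items[mid:], base, combine))

def member(x, S):
    """Membership query: the combining operator is logical-or."""
    return solve(S, lambda s: s == x, lambda a, b: a or b)

def closest_distance(x, S):
    """Closest point query on numbers: the combining operator is min."""
    return solve(S, lambda s: abs(s - x), min)

def min_weight(S):
    """Set problem without a query item: the combining operator is min."""
    return solve(S, lambda s: s, min)
```

For instance, `member(3, [5, 3, 9])` is true and `closest_distance(4, [5, 3, 9])` is 1, whichever way the list is partitioned.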

In this paper, we present a general technique for maintaining a dynamic set
$S$ with $d$ total orders, for constant $d$, under insertions of a single item,
deletions of a single item, and re-arrangements of any of the total orders
$\prec_1, \ldots, \prec_d$ on $S$ by means of split and concatenate operations.
Our queries involve finding the solution $\mathcal{P}(R)$ for only the items in
the subset $R \subseteq S$ identified by some ranges in the orders
$\prec_1, \ldots, \prec_d$. More formally, we introduce the following multiordered
set splitting and merging problem:

split($S$, $z$, $\prec_i$): Split $S$ into $S'$ and $S''$ according to item $z$
and the specified total order $\prec_i$ ($1 \le i \le d$). That is,
$x' \preceq_i z$ and $z \prec_i x''$ for all $x' \in S'$ and $x'' \in S''$. $S$ is
no longer available after this operation.

concatenate($S'$, $S''$, $\prec'_i$, $\prec''_i$): Combine $S'$ and $S''$ together
according to their respective $i$-th total orders $\prec'_i$ and $\prec''_i$
($1 \le i \le d$) into a new set $S = S' \cup S''$. The items in the resulting set
$S$ undergo the new order $\prec_i$ obtained by concatenating $\prec'_i$ and
$\prec''_i$. That is, $x \prec_i y$ in $S$ if and only if either (a)
$x \prec'_i y$ and $x, y \in S'$; or (b) $x \prec''_i y$ and $x, y \in S''$; or
(c) $x \in S'$ and $y \in S''$. $S'$ and $S''$ are no longer available after this
operation.

insert($z$, $S$): Insert item $z$ into set $S$ according to all orders
$\prec_1, \ldots, \prec_d$.

delete($z$, $S$): Delete item $z$ from set $S$.

range($(a_1, b_1), \ldots, (a_d, b_d)$, $S$): Let
$R = \{z \in S : a_i \preceq_i z \preceq_i b_i \text{ for } 1 \le i \le d\}$.
Find the solution $\mathcal{P}(R)$ to problem $\mathcal{P}$ restricted to region
$R$ only.

For $d = 1$, the recursive nature of order decomposable problems gives an
immediate tree structure, and each of the above operations can be simply
implemented in $O(f(p) \log p)$ time by using a 2-3-tree [2]. Maintaining $d > 1$
total orders on the same set $S$, while splitting or merging each order
independently of the others, makes things much more complicated than this simple
case. In the case of two or more different orders, indeed, there are some
technical difficulties, which are mainly due to the interplay among different
orders.
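For the single-order case $d = 1$, the meaning of split, concatenate and range can be pinned down with a reference model in Python. This is a sketch of the semantics only, under the assumption that the set is kept as a sorted Python list: slicing costs linear time, so it does not achieve the $O(f(p) \log p)$ bound of a 2-3-tree. The function names and the `solve` parameter (any function that solves the problem on a list of items) are hypothetical.

```python
import bisect

def split(S, z):
    """Return (S', S'') with x' <= z for all x' in S' and z < x'' for all x'' in S''."""
    cut = bisect.bisect_right(S, z)
    return S[:cut], S[cut:]

def concatenate(S1, S2):
    """Join two sorted lists when every item of S1 precedes every item of S2."""
    if S1 and S2 and not S1[-1] < S2[0]:
        raise ValueError("every item of S' must precede every item of S''")
    return S1 + S2

def range_solution(S, a, b, solve):
    """Solve the problem restricted to the items z with a <= z <= b."""
    lo = bisect.bisect_left(S, a)
    hi = bisect.bisect_right(S, b)
    return solve(S[lo:hi])

# Hypothetical example on the sorted set {1, 3, 5, 7}.
LEFT, RIGHT = split([1, 3, 5, 7], 3)
```

Splitting at 3 gives `[1, 3]` and `[5, 7]`, concatenating them restores the original list, and a range query on $[2, 6]$ with `solve=sum` returns 8.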

Related Work. Decomposable problems were first introduced by Bentley [6] for
dynamizing static data structures, while other dynamization techniques were
introduced in [7, 15, 18, 19, 24]. All these techniques rely on two main methods,
the equal block method [14, 15, 18] and the logarithmic method [6, 7, 24], in
which a big data structure is decomposed into small data structures, called
blocks; the number of blocks is properly tuned so as to obtain a good tradeoff
between queries and updates. Some lower bounds on the best possible tradeoff were
given in [7, 16]. Optimal solutions were obtained by combining the equal block
and the logarithmic method by means of the amortized solution in [19] and by the
global rebuilding technique yielding worst-case bounds in [23, 25]. The notion of
order decomposable problems was first introduced in [20] by generalizing the
results of [22] and was independently presented in [10]. Solving an ordered
decomposable problem only for the items contained in an input rectangular region
can be done by range queries on quad-trees [9] and k-d trees [5], but it is
difficult to keep them balanced (e.g., see [26, 27]). Many other elegant data
structures for range queries were devised subsequently and we refer the reader to
[8] for a comprehensive survey on this topic and a list of references. Among
them, [28] and [29] show how to combine decomposable problems and range queries
together so as to add some range restrictions to dynamic data structures. Split
and concatenate operations were subsequently introduced in [11, 13] for a set of
multidimensional points in addition to the standard operations: range queries,
insertions and deletions. Specifically, the divided k-d trees [11] for a set of
$p$ items supported
a  range,  a  split  or  a  concatenate  operation  in  p)  time  and  an 

insertion or a deletion in $O(\log p)$ time, with $O(p)$ space. In [13], a general
technique, based on the ordered equal block method, was described for solving
order decomposable problems and producing efficient concatenable data structures
in $O(p)$ space. The following time bounds were obtained for a split or
concatenate:
O(v^logp)  in  concatenate  interval  trees,  logp)  in  d-dimensional  2- 

3-trees  and  0(y/plogp  logp)  in  a  data  structure  for  convex  hulls.  The  bound  for 
insertions and deletions of items is $O(\log p)$ amortized, except for the
$O(\log^2 p)$ amortized bound in the data structure for convex hulls. The range
query bounds equal the split/concatenate cost plus an output sensitive cost
$O(occ)$, where $occ$ is the size of the output reported by the query. Although
the range queries in [28] and [29] are faster than the ones in [13], the
solutions in [13] support efficient splits and concatenates, require less space
and can be used to obtain an efficient dynamic version of static data structures.

Our results. In this paper, we present a novel technique for solving order
decomposable problems on $S$ under insertions, deletions, splits, concatenates
and range queries, yielding new and efficient concatenable data structures for
dimension $d > 1$. All these data structures are based on a new multidimensional
data structure, which we call the cross-tree. Differently from the approach of
[13], our general technique is based more on simple geometric properties than on
underlying sophisticated data structures, and exploits the fact that some data
structures can be built on sorted items more efficiently. By using our technique
we maintain a set $S$ of $p$ items in $O(p)$ space with the following worst-case
time bounds: $O(\log p)$ for the insertion or the deletion of a single item, and
$O(p^{1-1/d})$ for splits and concatenates along any order. We use this new
technique in a simple way for a wide range of applications to shave some log
factors from the best known bounds [11, 13]. We obtain new multidimensional data
structures implementing two-dimensional priority queues, two-dimensional search
trees, and concatenable interval trees. We achieve the following time bounds for
a split or
concatenate:  0{^/p)  in  concatenable  interval  trees,  in  d-dimensional 

2-3-trees  (or  divided  k-d  trees)  and  0{y/p\ogp)  in  a  data  structure  for  the 
convex hull. We also improve the query bounds because they are equal to the
split/concatenate cost plus an $O(occ)$ cost due to the output. Furthermore, we
make the bounds for insertions and deletions of a single item worst-case rather
than amortized. The new data structures work for many other order decomposable
problems under split and concatenate operations. For example, point insertions
and deletions in a planar Voronoi diagram of $p$ points take $O(p)$ time in
$O(p \log\log p)$ space [21] (a result in [1] is a semi-dynamic algorithm with
$O(p)$ deletion time and space). We obtain an $O(p)$ cost also for range, split
and concatenate operations in $O(p \log\log p)$ space (the techniques in
[13, 28, 29] require more time or space). This solves a problem posed in [1]
(i.e., compute the Voronoi diagram for any given subset $R \subseteq S$ of points
in less than $O(p \log p)$ time) for the special case in which $R$ is defined by
range queries on a dynamic set $S$. Our technique for order decomposable problems
is suitable for efficient external memory algorithms. For the case $d = 1$,
B-trees [4] are very popular data structures that can be successfully employed in
decomposable search problems analogously to concatenable 2-3-trees. For $d > 1$,
no provably good external memory data structures for splitting and concatenating
along any dimension were previously known in the literature. In this extended
abstract, many details are omitted for lack of space.

2  Splitting  and  Merging  Data  Structures 

In this section, we describe how to maintain $d = 2$ total orders, which we
denote by $\prec_x$ and $\prec_y$, under split and concatenate operations. Let
$p$ be the number of items in $S$. Each item $z \in S$ can be associated with a
dynamic point $(X(z), Y(z))$ in the Cartesian plane, such that $X(z)$ is the rank
of $z$ in $S$ with respect to current order $\prec_x$ and $Y(z)$ is the rank of
$z$ in $S$ with respect to current order $\prec_y$. Starting from $p$ items in
$S$, we obtain $p$ points in the Cartesian plane, which can be stored in the form
of a $p \times p$ sparse and dynamic matrix $M$.

The operations in $S$ can be simulated by a certain number of operations in
$M$. Operation $split(S, z, \prec_x)$ corresponds to splitting matrix $M$
horizontally at position $X(z)$, which is the rank of $z$ in $S$ with respect to
$\prec_x$, while doing the same according to order $\prec_y$ is equivalent to
splitting $M$ vertically at position $Y(z)$. Concatenating is analogous. Operations
$insert(z, S)$ and $delete(z, S)$ require a new operation which sets entry
$M[X(z), Y(z)]$ to item $z$ or to an empty value, respectively. Finally, solving
problem V in the region specified by $range((a_x, b_x), (a_y, b_y), S)$ can be done
by solving V for the points contained in the rectangular part of $M$ defined by the
ranks of $a_x, b_x, a_y, b_y$ in their corresponding orders. We can state our
multiordered set splitting and merging problem by using our sparse matrix $M$.
Formally, for any integers $h_1, h_2, v_1, v_2$ ($1 \le h_1 \le h_2 \le p$ and
$1 \le v_1 \le v_2 \le p$), we use $M[h_1, h_2; v_1, v_2]$ to denote the submatrix
of $M$ that contains entries $M[i, j]$ with $h_1 \le i \le h_2$ and
$v_1 \le j \le v_2$. We call this submatrix a region. We can disassemble and
reassemble a single matrix $M$ in many different ways by using any sequence of the
following operations:

h-split($M$, $i$): Split $M$ horizontally at row $i$ and obtain two new matrices
$M_1$ and $M_2$, such that $M_1 = M[1, i; 1, p]$ and $M_2 = M[i+1, p; 1, p]$. In
other words, $M_1$ is given by the first $i$ rows of $M$ and $M_2$ is given by the
last $(p - i)$ rows of $M$. $M$ is no longer available after the operation.
h-concatenate($M_1$, $M_2$): Let $M_1$ have size $m \times p$ and $M_2$ have size
$n \times p$. We meld $M_1$ and $M_2$ horizontally and produce a matrix $M$ of size
$(m + n) \times p$, such that $M[1, m; 1, p] = M_1$ and
$M[m+1, m+n; 1, p] = M_2$. In other words, the first $m$ rows of $M$ are given by
$M_1$ and the last $n$ rows of $M$ are given by $M_2$. This operation assumes that
$M_1$ and $M_2$ have the same number of columns. $M_1$ and $M_2$ are no longer
available after the operation.
set($i$, $j$, $w$, $M$): Update $M$ by setting $M[i, j] = w$. This corresponds
either to an insertion (if $w$ is nonempty) or to a deletion (if $w$ is empty).
range($h_1$, $h_2$, $v_1$, $v_2$, $M$): Find the solution $V(R)$ to problem V
restricted to the nonempty entries contained in region $R = M[h_1, h_2; v_1, v_2]$.
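
To make the semantics of these operations concrete, the following is a minimal reference model, not the data structure of this paper: it stores the points in a dictionary and spends linear time per split or concatenate. The class name `SparseMatrix` and the method names are hypothetical, and `range_items` simply reports the points of a region rather than solving a problem V on them.

```python
from typing import Dict, Hashable, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column), both 1-based as in the text


class SparseMatrix:
    """Naive model of the sparse matrix M; every operation is linear time."""

    def __init__(self, rows: int, cols: int) -> None:
        self.rows, self.cols = rows, cols
        self.cells: Dict[Cell, Hashable] = {}

    def set_entry(self, i: int, j: int, w: Optional[Hashable]) -> None:
        # A nonempty w is an insertion, w = None is a deletion.
        if w is None:
            self.cells.pop((i, j), None)
        else:
            self.cells[(i, j)] = w

    def h_split(self, i: int) -> Tuple["SparseMatrix", "SparseMatrix"]:
        # M1 gets the first i rows, M2 the remaining rows (renumbered from 1).
        top = SparseMatrix(i, self.cols)
        bottom = SparseMatrix(self.rows - i, self.cols)
        for (r, c), w in self.cells.items():
            if r <= i:
                top.cells[(r, c)] = w
            else:
                bottom.cells[(r - i, c)] = w
        self.cells = {}  # M is no longer available after the operation
        return top, bottom

    @staticmethod
    def h_concatenate(m1: "SparseMatrix", m2: "SparseMatrix") -> "SparseMatrix":
        assert m1.cols == m2.cols, "both matrices need the same number of columns"
        out = SparseMatrix(m1.rows + m2.rows, m1.cols)
        out.cells.update(m1.cells)
        for (r, c), w in m2.cells.items():
            out.cells[(r + m1.rows, c)] = w
        m1.cells, m2.cells = {}, {}  # the inputs are consumed
        return out

    def range_items(self, h1: int, h2: int, v1: int, v2: int) -> List[Tuple[Cell, Hashable]]:
        # Points of region M[h1, h2; v1, v2], in row-major order.
        return sorted(
            (cell, w) for cell, w in self.cells.items()
            if h1 <= cell[0] <= h2 and v1 <= cell[1] <= v2
        )
```

The vertical operations would be obtained by exchanging the roles of rows and columns.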

Operations v-concatenate($M_1$, $M_2$) and v-split($M$, $j$) are similarly defined.
We restrict ourselves to the special case where each row or column of $M$ contains
a constant number of points, but our technique works for a general matrix $M$. We
need some preliminary definitions. Let $X = \{x_1, x_2, \ldots, x_q\}$ be a sorted
sequence of $q$ elements, according to a total order
$x_1 \prec x_2 \prec \cdots \prec x_q$. Let $I_1, \ldots, I_s$ be a partition of
$X$ into adjacent intervals, so that for $1 \le i \le s - 1$ all the elements in
$I_i$ precede all the elements in $I_{i+1}$. For $1 \le i \le s$, let $|I_i|$ denote
the size of interval $I_i$, defined as the number of elements in $I_i$.

Definition 1. (Size Invariant) Let $k \ge 1$ be a positive integer. The adjacent
intervals $I_1, \ldots, I_s$ satisfy the size invariant of order $k$ if the
following two conditions are met: (a) $|I_i| \le k$, $1 \le i \le s$;
(b) $|I_i| + |I_{i+1}| > k$, $1 \le i \le s - 1$.

The size invariant of order $k$ in Definition 1 implies that the number $s$ of
intervals is $O(q/k)$. Moreover, the size invariant can be maintained in
$O(\log k)$ time when an element is deleted from $X$ or a new element is inserted
into $X$.
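
The two conditions of the size invariant are easy to check directly. The sketch below is illustrative only: the function names are hypothetical, and `initial_partition` is one simple way to build a valid partition of $q$ elements (the text later pads the last stripe with dummy rows instead of merging it into its neighbor).

```python
from typing import List


def satisfies_size_invariant(sizes: List[int], k: int) -> bool:
    """Check conditions (a) and (b) for interval sizes |I_1|, ..., |I_s|."""
    # (a) every interval is nonempty and has at most k elements
    if any(size < 1 or size > k for size in sizes):
        return False
    # (b) any two adjacent intervals together hold more than k elements
    return all(a + b > k for a, b in zip(sizes, sizes[1:]))


def initial_partition(q: int, k: int) -> List[int]:
    """Split q elements into interval sizes that satisfy the invariant."""
    chunk = k // 2 + 1  # chunk <= k and 2 * chunk > k for every k >= 1
    sizes = [chunk] * (q // chunk)
    rest = q % chunk
    if rest:
        if sizes and sizes[-1] + rest <= k:
            sizes[-1] += rest   # fold the leftover into the last interval
        else:
            sizes.append(rest)  # here chunk + rest > k, so (b) still holds
    return sizes
```

Because disjoint adjacent pairs each hold more than $k$ elements, such a partition has fewer than $2q/k + 1$ intervals, which is the $O(q/k)$ bound stated above.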

We now introduce the cross-tree, which is a 2-dimensional data structure
supporting efficient split and concatenate operations. Intuitively, a cross-tree
describes a balanced decomposition of a 2-dimensional set, and it is based upon
a variant of the 2-3-tree [2], which we call 1-2-tree. A 1-2-tree satisfies two
conditions: (a) All the leaves are on the same level and each internal node has at
most two children. (b) The children of all the internal nodes on the same level
satisfy the size invariant of order 2 according to Definition 1. It follows that no
two adjacent nodes can have a single child. It can be shown that 1-2-trees are
balanced and that a 1-2-tree with $n$ leaves can be modified by means of split,
concatenate, insert and delete operations in $O(\log n)$ time per operation, with
each operation involving at most $O(1)$ nodes and parent pointers per level.

Definition 2. (Cross-Tree) Let $T$ and $S$ be two 1-2-trees having the same
height. The cross-tree $CT(T \times S)$ is the cross product of $T$ and $S$ defined
as follows. For each node $u$ in $T$, there is a node $\alpha_{uv}$ in
$CT(T \times S)$ for every node $v$ in $S$ on the same level as $u$. For each edge
$(u, u')$ in $T$, there is an edge $(\alpha_{uv}, \alpha_{u'v'})$ in
$CT(T \times S)$ for every edge $(v, v')$ in $S$, such that $u$ and $v$ are on the
same level.
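
Definition 2 can be sketched as a level-by-level product. The representation below is an assumption made for illustration: a tree is a list of levels (root first), and each node is the list of indices of its children on the next level. The function `cross_tree` is hypothetical and builds the product explicitly, without any of the update machinery.

```python
from itertools import product
from typing import List, Tuple

Tree = List[List[List[int]]]     # tree[level][node] = child indices on level + 1
CrossNode = Tuple[int, int, int]  # (level, node index in T, node index in S)


def cross_tree(T: Tree, S: Tree) -> Tuple[List[CrossNode], List[Tuple[CrossNode, CrossNode]]]:
    """Return the nodes and edges of CT(T x S) for two trees of equal height."""
    assert len(T) == len(S), "T and S must have the same height"
    nodes: List[CrossNode] = []
    edges: List[Tuple[CrossNode, CrossNode]] = []
    for level, (t_level, s_level) in enumerate(zip(T, S)):
        for u, v in product(range(len(t_level)), range(len(s_level))):
            nodes.append((level, u, v))
            # one edge for every pair of an edge (u, u') in T and an edge (v, v') in S
            for cu, cv in product(t_level[u], s_level[v]):
                edges.append(((level, u, v), (level + 1, cu, cv)))
    return nodes, edges
```

Since every internal node of a 1-2-tree has one or two children, a node of the product has 1, 2 or 4 children.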

Each internal node of a cross-tree has either 1, 2 or 4 children, and the cross-tree
is balanced (i.e., its height is logarithmic with respect to the number of its
leaves). We can update a cross-tree $CT(T \times S)$ by modifying either $T$ or $S$
(i.e., we can split, concatenate, insert or delete in one of the 1-2-trees) and
obtain the corresponding cross-tree efficiently. We can show:

Theorem 3. We can split a 1-2-tree $T$ into $T_1$ and $T_2$ in order to obtain
cross-trees $CT(T_1 \times S)$ and $CT(T_2 \times S)$ from cross-tree
$CT(T \times S)$ in $O(|S|)$ time. We can concatenate 1-2-trees $T_1$ and $T_2$
into $T$ to obtain $CT(T \times S)$ from $CT(T_1 \times S)$ and $CT(T_2 \times S)$
in $O(|S|)$ time.

2.1  The  General  Technique 

We now treat our splitting and merging problem for a matrix $M$. We refer to
the $p$ nonempty entries of $M$ as the points of $M$ and let $k$ be a slack
parameter, where $k$ is an integer with $1 \le k \le p$. We handle the sparse
$p \times p$ matrix $M$ as if it were a dense $O(p/k + k) \times O(p/k + k)$
matrix. We then tune $k$ according to the chosen problem V and the cost $f(p)$ of
operator $\Box$. We proceed as follows. We group adjacent rows and columns of
matrix $M$ into horizontal and vertical stripes, respectively, such that the
stripes satisfy the size invariant of order $k$ (Definition 1), where the size of a
horizontal (respectively vertical) stripe is given by its number of rows
(respectively columns). The size invariant guarantees that each stripe contains at
most $O(k)$ points and that the total number of horizontal and vertical stripes is
$O(p/k)$. The partition into horizontal and vertical stripes induces a partition of
$M$ into $O(p^2/k^2)$ squares, such that each square intersects no more than $k$
rows and $k$ columns. We call these the basic squares in $M$. We maintain the
solution to V for each such basic square and store these solutions in the leaves of
a cross-tree $CT(T_h \times T_v)$, which describes recursively the partition of $M$
into its basic squares. For this purpose, we employ two 1-2-trees, denoted by $T_h$
and $T_v$, whose leaves are in one-to-one correspondence to the horizontal and
vertical stripes, respectively. Trees $T_h$ and $T_v$ have $O(p/k)$ leaves, one for
each stripe of $M$, and a total of $O(p/k)$ nodes. Consequently, cross-tree
$CT(T_h \times T_v)$ has height $O(\log(p/k))$ and $O(p^2/k^2)$ leaves, one for
each basic square of $M$, and a total of $O(p^2/k^2)$ nodes. Its leaves
corresponding to the nonempty basic squares in either a horizontal or vertical
stripe can be retrieved in $O(p/k)$ time, and the points in the stripe can be
retrieved in additional $O(k)$ time. We then percolate the solutions from the
leaves of the cross-tree towards its internal nodes in a heap-like fashion by means
of operator $\Box$. If the solutions occupy more than $O(f(p))$ space, we save
space whenever $\Box$ is invertible: we say that $\Box$ is invertible if we can
keep $O(f(p))$ additional information associated with any solution
$V(R) = \Box(V(R'), V(R''))$, so that we can compute
$\Box^{-1}(V(R)) = \{V(R'), V(R'')\}$ in $O(f(p))$ time. For example, if V is the
range query problem and $\Box$ is the destructive list append with cost
$f(p) = O(1)$, we can simply keep a pointer to the last item in the appended lists
to "de-append" them in $O(1)$ time.
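
The "de-append" idea can be sketched with singly linked lists. The names `LinkedList`, `append` and `de_append` are hypothetical, and the sketch assumes both input lists are nonempty: the only extra information kept is the node at the seam, i.e. the last item of the first list.

```python
from typing import Iterable, List, Optional, Tuple


class Node:
    def __init__(self, value: object) -> None:
        self.value = value
        self.next: Optional["Node"] = None


class LinkedList:
    def __init__(self, values: Iterable[object] = ()) -> None:
        self.head: Optional[Node] = None
        self.tail: Optional[Node] = None
        for value in values:
            node = Node(value)
            if self.tail is None:
                self.head = node
            else:
                self.tail.next = node
            self.tail = node

    def to_list(self) -> List[object]:
        out, node = [], self.head
        while node is not None:
            out.append(node.value)
            node = node.next
        return out


def append(left: LinkedList, right: LinkedList) -> Tuple[LinkedList, Node]:
    """Destructively append right to left in O(1); also return the seam node."""
    assert left.tail is not None and right.head is not None
    seam = left.tail
    seam.next = right.head          # link the two lists, no copying
    combined = LinkedList()
    combined.head, combined.tail = left.head, right.tail
    return combined, seam


def de_append(combined: LinkedList, seam: Node) -> Tuple[LinkedList, LinkedList]:
    """Undo append in O(1) by cutting the list right after the seam node."""
    left, right = LinkedList(), LinkedList()
    left.head, left.tail = combined.head, seam
    right.head, right.tail = seam.next, combined.tail
    seam.next = None
    return left, right
```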

Our data structure has the following additional features. For each nonempty
basic square of $M$, we keep its points sorted according to a total order
$\prec_p$ (not necessarily equal to $\prec_x$ or $\prec_y$) by means of a threaded
binary search tree, whose nodes are linked together in symmetrical order.
Searching, inserting and deleting a point takes $O(\log k)$ time. Scanning the
points in a basic square in their $\prec_p$-order takes constant time per scanned
point. We introduce order $\prec_p$ because some data structures can be built more
efficiently on a sorted set of points. Each node in cross-tree
$CT(T_h \times T_v)$ corresponds to a region $R$ of matrix $M$. The cross-tree
leaves correspond to the basic squares (leaves corresponding to the empty basic
squares can be ignored). An internal node $\mu$ corresponds to region
$R = M[h_1, h_2; v_1, v_2]$ and has no more than four children $\mu_1$, $\mu_2$,
$\mu_3$, and $\mu_4$ corresponding to four subregions of $R$ (if a child $\mu_i$ is
empty then the corresponding subregion is empty). We store the solutions to V in
the following way. For each nonempty basic square of $M$, we store the solution for
its points in the corresponding cross-tree leaf. For each internal node $\mu$ of
the cross-tree, we use the fact that V is order decomposable to store
$\Box(s_1, \ldots, s_j)$ in $\mu$, where $s_1, \ldots, s_j$ are the solutions
stored in its $j \le 4$ children. This is indeed the solution $V(R)$ for the points
in the region $R$ corresponding to $\mu$, and it is stored in an efficient way
depending on the problem V.

We now show how to use our data structure for solving problem V. We denote
by $P(k)$ the cost of preprocessing an $O(k)$-point stripe to solve problem V for
every basic square in the stripe. We will exploit the fact that the basic squares
are already $\prec_p$-ordered to determine $P(k)$, and we assume that
$P(k) \ge k$ is a smooth nondecreasing function. Furthermore, we use $U(k)$ to
denote the cost of updating the solution to problem V for a basic square in an
$O(k)$-point stripe after its preprocessing. We assume that $U(k) \ge \log k$,
since we have to update at least the threaded search tree in the basic square.
Finally, we denote by $S(k)$ the space occupied by an $O(k)$-point stripe. We also
assume that $S(k) \ge k$ is a smooth nondecreasing function. In most of our
applications, we will have $P(k), S(k) = O(k)$ and $U(k) = O(f(k) \log k)$. In the
preprocessing, we put $(k+1)/2$ rows (columns) per stripe except for the last one,
which has some dummy rows (columns) added, and build the cross-tree. This takes
$O(p \log p + P(p) + f(p)\, p^2/k^2)$ time.

We now describe in some detail how to perform a v-split($M$, $j$). Column $j$
might fall inside a vertical stripe $\sigma$, which must necessarily be split. We
examine the basic squares of $\sigma$. Given a basic square, we scan its points
according to their $\prec_p$-order and produce two $\prec_p$-ordered lists in
linear time: one list contains all the points whose second coordinate is smaller
than or equal to $j$ and the other list contains the remaining points, i.e., the
points whose second coordinate is larger than $j$. We split this basic square into
two squares and build two threaded search trees for them in linear time by using
the two $\prec_p$-ordered lists. Since each stripe consists of $O(p/k)$ basic
squares and contains $O(k)$ points, we can examine stripe $\sigma$ square by square
in $O(k + p/k)$ time and split it into new stripes $\sigma_1$ and $\sigma_2$, such
that $\sigma_1$ contains all the points of $\sigma$ before and including column
$j$, and $\sigma_2$ contains all the points of $\sigma$ after column $j$. This
creates $O(p/k)$ smaller squares and costs $O(k + p/k)$ time. We check to see if we
can combine $\sigma_1$ and $\sigma_2$ with their neighbor stripes to maintain the
size invariant of order $k$. For any two such stripes to be merged, we examine
their basic squares in pairs (a square per stripe), such that the two squares are
on the same horizontal stripe. We take their two $\prec_p$-ordered lists of points
and merge them to build a threaded search tree on the resulting list in linear
time. Again, this requires $O(k + p/k)$ total time. It is worth noting that
splitting and merging stripes preserves the order of their presorted points. Next,
we determine the solutions for the basic squares in the $O(1)$ stripes involved at
a total cost of $P(k)$ time.

It remains to split cross-tree $CT(T_h \times T_v)$ to reflect the split operation
on the vertical stripes. We first focus on the cross-tree topology and discuss
later on how to maintain the solutions to V in its nodes. We have to split the
1-2-tree $T_v$ at the leaf $w$ corresponding to stripe $\sigma$. We split $w$ into
two new leaves $w_1$ and $w_2$, corresponding to the split of $\sigma$ into the new
stripes $\sigma_1$ and $\sigma_2$. If $\sigma_1$ or $\sigma_2$ are combined with
their neighbor stripes, we should do the same on $w_1$ and $w_2$ and their neighbor
leaves. We check to see if the 1-2-tree $T_v$ satisfies the size invariant of order
2 along a leaf-to-root path and update the corresponding $O(p/k)$ cross-tree
leaves. Globally, we create no more than $O(p/k)$ leaves corresponding to the new
basic squares in $O(1)$ stripes and we traverse and reorganize their ancestor nodes
all the way up to the cross-tree root by Theorem 3 (with $T = T_h$ and $S = T_v$).
Consequently, maintaining the cross-tree topology takes $O(p/k)$ time. Next, we
recompute the solutions to V in the traversed cross-tree nodes by applying operator
$\Box$ to them upwards, in $O(f(p))$ time per node (we show in the full paper how
to do this with $\Box^{-1}$ if $\Box$ is invertible). Since we traverse a total of
$O(p/k)$ nodes, it takes $O(f(p)\, p/k)$ time to recompute their solutions. It
therefore takes a total of $O(k + f(p)\, p/k + P(k)) = O(f(p)\, p/k + P(k))$ time
to execute v-split, as $P(k) \ge k$. The implementation of h-split is completely
analogous. We do not discuss here the other operations due to lack of space and
refer the reader to the full paper. There, we prove the following main theorem:

Theorem 4. The splitting and merging problem on $p$ points can be solved with
the following time bounds for a parameter $k$ ($1 \le k \le p$) and an operator
cost $f(p)$: range, h-split, v-split, h-concatenate, v-concatenate:
$O((p/k) f(p) + P(k) + P(p)/p)$, with $P(k) \ge k$; set:
$O(\log(p/k) f(p) + U(k) + (p/k^2) f(p) + P(p)/p)$, with $U(k) \ge \log k$. The
space required is $O(S(p) + (p^2/k^2) f(p))$ and the preprocessing time is
$O(p \log p + P(p) + (p^2/k^2) f(p))$.
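
The two list manipulations used by v-split, namely splitting the ordered points of a basic square at column $j$ and merging the ordered points of two squares, can be sketched as follows. This is an illustration under assumptions: points are `(row, column)` pairs, `order_key` is a hypothetical stand-in for the order $\prec_p$, and plain Python lists replace the threaded search trees.

```python
import heapq
from typing import List, Tuple

Point = Tuple[int, int]  # (row, column)


def order_key(pt: Point) -> Tuple[int, int]:
    """Hypothetical total order on points, standing in for the order on squares."""
    return (pt[0] + pt[1], pt[0])


def split_square(points: List[Point], j: int) -> Tuple[List[Point], List[Point]]:
    """One linear scan; both output lists keep the input order."""
    left: List[Point] = []
    right: List[Point] = []
    for pt in points:
        (left if pt[1] <= j else right).append(pt)
    return left, right


def merge_squares(a: List[Point], b: List[Point]) -> List[Point]:
    """Linear-time merge of two lists that are already sorted by order_key."""
    return list(heapq.merge(a, b, key=order_key))
```

Both routines touch each point once, which is why presorted points survive stripe splits and merges without re-sorting.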

Theorem 4 states the bounds needed for solving a general decomposable problem
V in terms of the parameter $k$, $1 \le k \le p$. In most of our applications,
$f(p) = O(p^{\epsilon})$ for a non-negative constant $\epsilon < 1$ and the
preprocessing cost of a stripe is $P(k) = O(k)$ because we have presorted points.
In this case, since $U(k) = O(f(k) \log k)$ and $S(k) = O(k)$ [20, 21], we can tune
$k = \lceil \sqrt{p\, f(p)} \rceil$:

Theorem 5. The splitting and merging problem on $p$ items can be solved with
the following time bounds whenever the cost of operator $\Box$ is
$f(p) = O(p^{\epsilon})$ for a non-negative constant $\epsilon < 1$: range,
h-split, v-split, h-concatenate, v-concatenate in $O(\sqrt{p\, f(p)})$; set in
$O(\log(p) f(p))$. The space required is $O(p)$ and the preprocessing time is
$O(p \log p)$.
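
The choice of $k$ can be checked by substituting $P(k), S(k) = O(k)$ and $U(k) = O(f(k) \log k)$ into the bounds of Theorem 4. The following is a sketch of that arithmetic, under the assumptions that $f$ is nondecreasing and that the ceiling can be ignored:

```latex
\begin{align*}
\frac{p}{k}\, f(p) = k
  &\;\Longrightarrow\; k = \sqrt{p\, f(p)}, \\
\text{split, concatenate, range:}\quad
  \frac{p}{k}\, f(p) + P(k) + \frac{P(p)}{p}
  &= O\bigl(\sqrt{p\, f(p)}\bigr) + O(k) + O(1)
   = O\bigl(\sqrt{p\, f(p)}\bigr), \\
\text{set:}\quad
  \log\Bigl(\frac{p}{k}\Bigr) f(p) + U(k) + \frac{p}{k^{2}}\, f(p) + \frac{P(p)}{p}
  &= O\bigl(f(p) \log p\bigr) + O\bigl(f(k) \log k\bigr) + O(1) + O(1)
   = O\bigl(f(p) \log p\bigr), \\
\text{space:}\quad
  S(p) + \frac{p^{2}}{k^{2}}\, f(p)
  &= O(p) + O(p) = O(p).
\end{align*}
```

The condition $\epsilon < 1$ keeps $k = \sqrt{p\, f(p)}$ below $p$ for large $p$, so the parameter stays in the range required by Theorem 4.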

The analysis in Theorem 4 is overly pessimistic when $f(p) = O(p)$. Using
weighted balanced B-trees [3] in place of 1-2-trees yields a different analysis
and better bounds:

Theorem 6. The splitting and merging problem on $p$ items can be solved with the
following time bounds when $f(p) = O(p)$: set, range, h-split, v-split,
h-concatenate, v-concatenate in $O(p)$. The space required is $O(p \log\log p)$ and
the preprocessing time is $O(p \log p)$.


3  Some  Applications 

In this section we list a few applications of Theorems 4-6. The problems in
Theorems 7-9 are all $O(1)$-order decomposable; the problem in Theorem 10 is
$O(\log p)$-order decomposable, while the one in Theorem 11 is $O(p)$-order
decomposable. Most of the worst-case bounds reported in this section improve the
best previously known bounds for the same problems [11, 13]. The improvement
consists of shaving a logarithmic factor from the previous bounds and of making
some bounds worst-case rather than amortized. We omit the details.

Theorem 7. A two-dimensional priority queue for a set of $p$ items can be
maintained in the following time bounds: an item insertion or deletion in
$O(\log p)$; a split or concatenate of any order in $O(\sqrt{p})$; and a
minimum-weight query in a region in $O(\sqrt{p})$. The space required is $O(p)$ and
the preprocessing time is $O(p \log p)$.

Theorem 8. A two-dimensional 2-3-tree storing $p$ points can be maintained
with the following bounds: a point insertion or deletion in $O(\log p)$; a split or
concatenate along any coordinate in $O(\sqrt{p})$; a range search in
$O(\sqrt{p} + occ)$, where $occ$ is the number of points reported by the search.
The space required is $O(p)$ and the preprocessing time is $O(p \log p)$.

Theorem 9. An interval tree that stores $n$ (overlapping) intervals from the line
can be maintained with the following time bounds: an interval insertion or
deletion in $O(\log n)$; a stabbing query (i.e., find all the intervals containing
a given point) retrieving only the intervals whose lengths are between two input
values $\ell_1 \ldots \ell_2$ in $O(\sqrt{n} + occ)$, where $occ$ is the number of
such intervals; a split or concatenate of the intervals according to a
perpendicular stabbing line in $O(\sqrt{n})$. The space required is $O(n)$ and the
preprocessing time is $O(n \log n)$.

Theorem 10. The convex hull for $p$ points in the Cartesian plane can be
maintained with the following time bounds: a point insertion or deletion in
$O(\log^2 p)$; a split or concatenate along one coordinate in
$O(\sqrt{p \log p})$; a query checking if a point is inside or outside the convex
hull in $O(\log p)$; a query reporting the convex hull for the points in any input
region in $O(\sqrt{p \log p} + h)$, where $h$ is the output size. The space
required is $O(p)$ and the preprocessing time is $O(p \log p)$.

Theorem 11. The Voronoi diagram for $p$ points in the Cartesian plane can be
maintained with the following worst-case time bounds: a point insertion or
deletion: $O(p)$; a split or concatenate along one coordinate: $O(p)$; a query
reporting the Voronoi diagram for the points in an input region: $O(p)$. The space
required is $O(p \log\log p)$ and the preprocessing time is $O(p \log p)$.

The above results show that our technique is a general paradigm on which we
can cast many other split-and-concatenate data structures in some basic problems
(e.g., member searching, predecessor, ranking), computational geometry
(e.g., neighbor queries, union and intersection queries), database applications
(e.g., partial match queries, range queries) and statistics (e.g., maxima queries).
We refer the interested reader to [7, 17, 21] for more decomposable problems.
We only mention here that our technique can be extended to $d > 2$ total orders
$\prec_1, \ldots, \prec_d$ and can be efficiently implemented in external memory.
Details will be given in the full paper.

Acknowledgments.  We  are  indebted  to  Amnon  Nissenzweig  and  to  Giuseppe 
Persiano  for  many  delightful  conversations  at  the  beginning  of  this  research.  We 
are  grateful  to  Lars  Arge  and  to  Paolo  Ferragina  for  helpful  discussions  and  to 
Marc  van  Kreveld  and  Mark  Overmars  for  sending  us  a  copy  of  [13]. 

References 

1. A. Aggarwal, L. Guibas, J. Saxe and P.W. Shor, A linear-time algorithm for
computing the Voronoi diagram of a convex polygon. Discrete and Computational
Geometry 4 (1989), 591-604.

2.  A.V.  Aho,  J.E.  Hopcroft,  and  J.D.  Ullman.  The  Design  and  Analysis  of  Computer 
Algorithms.  Addison-Wesley,  Reading,  MA,  1974. 

3.  L.  Arge  and  J.S.  Vitter,  Optimal  dynamic  interval  management  in  external  memory. 
37th  IEEE  Symp.  on  Foundations  of  Computer  Science  (1996). 

4. R. Bayer and C. McCreight, Organization and maintenance of large ordered
indexes. Acta Informatica 1, 3 (1972), 173-189.

5. J.L. Bentley, Multidimensional binary search trees used for associative searching.
Comm. ACM, 18 (1975), 509-517.

6.  J.L.  Bentley,  Decomposable  Searching  Problems.  Information  Processing  Letters, 
8  (1979),  244-251. 

7.  J.L.  Bentley  and  J.B.  Saxe,  Decomposable  Searching  Problems  I.  Static-to- 
Dynamic  Transformation.  J.  of  Algorithms,  1  (1980),  301-358. 

8.  Y.-J.  Chiang  and  R.  Tamassia,  Dynamic  Algorithms  in  Computational  Geometry, 
Proceedings  of  the  IEEE,  Special  issue  on  Computational  Geometry,  G.  Toussaint, 
ed.,  80  (1992)  1412-1434. 


9. R.A. Finkel and J.L. Bentley, Quad-trees: a data structure for retrieval of composite
keys. Acta Inform., 4 (1974), 1-9.

10. I.G. Gowda and D.G. Kirkpatrick, Exploiting linear merging and extra storage in
the maintenance of fully dynamic geometric data structures. In Proc. 19th Allerton
Conference on Communication, Control and Computing (1980), 1-10.

11.  M.J.  van  Kreveld  and  M.H.  Overmars,  Divided  k-d  trees,  Algorithmica,  6  (1991), 
840-858. 

12.  M.J.  van  Kreveld  and  M.H.  Overmars,  Union-copy  structures  and  dynamic  segment 
trees,  J.  ACM,  40  (1993),  635-652. 

13.  M.J.  van  Kreveld  and  M.H.  Overmars,  Concatenable  structures  for  decomposable 
problems.  Information  and  Computation,  110  (1994),  130-148. 

14. J. van Leeuwen and M.H. Overmars, The art of dynamizing. In Proc. 10th
Mathematical Foundations of Computer Science, LNCS, 118 (1981), 121-131.

15.  J.  van  Leeuwen  and  D.  Wood,  Dynamization  of  decomposable  searching  problems. 
Information  Processing  Letters,  10  (1980),  51-56. 

16.  K.  Mehlhorn,  Lowerbounds  on  the  efficiency  of  transforming  static  data  structures 
into  dynamic  structures.  Mathematical  System  Theory,  15  (1981),  1-16. 

17. K. Mehlhorn, Multi-Dimensional Searching and Computational Geometry, EATCS
Monographs on Theoretical Computer Science, vol. 3, Springer-Verlag, 1984.

18.  H.A.  Maurer  and  T.A.  Ottmann,  Dynamic  solutions  of  decomposable  searching 
problems.  In  Discrete  Structures  and  Algorithms,  U.  Pape  ed.,  Hanser  Verlag, 
Wien,  (1979),  17-24. 

19. K. Mehlhorn and M.H. Overmars, Optimal dynamization of decomposable
searching problems. Information Processing Letters, 12 (1981), 93-98.

20.  M.  H.  Overmars,  Dynamization  of  order  decomposable  set  problems.  J.  Algorithms, 
2  (1981),  245-260. 

21.  M.H.  Overmars,  The  Design  of  Dynamic  Data  Structures,  LNCS  156,  Springer- 
Verlag,  Berlin/New  York,  1983. 

22.  M.H.  Overmars  and  J.  van  Leeuwen,  Maintenance  of  configurations  in  the  plane. 
Journal  of  Computer  and  System  Sciences,  23  (1981),  166-204. 

23. M.H. Overmars and J. van Leeuwen, Dynamization of decomposable searching
problems yielding good worst-case bounds. In Proc. 5th GI Conference on
Theoretical Computer Science, LNCS, 104 (1981), 224-233.

24.  M.H.  Overmars  and  J.  van  Leeuwen,  Some  principles  for  dynamizing  decomposable 
searching  problems.  Information  Processing  Letters,  12  (1981),  49-53. 

25.  M.H.  Overmars  and  J.  van  Leeuwen,  Worst-case  optimal  insertion  and  deletion 
methods  for  decomposable  searching  problems.  Information  Processing  Letters,  12 
(1981),  168-173. 

26.  M.H.  Overmars  and  J.  van  Leeuwen,  Dynamic  Multi-dimensional  data  structures 
based  on  quad-  and  k-d  trees.  Acta  Inform.,  17  (1982),  267-285. 

27.  H.  Samet,  Bibliography  on  quad-trees  and  related  hierarchical  data  structures.  In 
Data  Structures  for  Raster  Graphics,  L.  Kessenaar,  F.  Peters,  and  M.  van  Lierop 
eds.,  Springer-Verlag,  Berlin,  (1986),  181-201. 

28.  H.W.  Scholten  and  M.H.  Overmars,  General  methods  for  adding  range  restrictions 
to  decomposable  searching  problems,  J.  of  Symbolic  Computation,  7  (1989),  1-10. 

29.  D.E.  Willard  and  G.S.  Lueker,  Adding  range  restriction  capability  to  dynamic  data 
structures,  J.  ACM,  32  (1985),  597-617. 


Efficient  Array  Partitioning 


Sanjeev Khanna^1, S. Muthukrishnan^2 and Steven Skiena^3

^1 Mathematical Sciences Research Center, Bell Laboratories, Lucent Technologies,
700 Mountain Avenue, Murray Hill, NJ 07974. sanjeev@research.bell-labs.com
^2 Information Sciences Center, Bell Laboratories, Lucent Technologies, 700 Mountain
Avenue, Murray Hill, NJ 07974. muthu@research.bell-labs.com
^3 Dept. of Computer Science, State University of New York, Stony Brook, NY
11794-4400. skiena@cs.sunysb.edu. This work is partially supported by ONR award
400x116yip01 and NSF Grant CCR-9625669.


Abstract. We consider the problem of partitioning an array of $n$ items
into $p$ intervals so that the maximum weight of the intervals is minimized.
The currently best known bound for this problem is $O(n + p^{1+\epsilon})$ [HNC92]
for any fixed $\epsilon < 1$. In this paper, we present an algorithm that runs in
time $O(n \log n)$; this is the fastest known algorithm for arbitrary $p$.

We consider the natural generalization of this partitioning to two dimensions,
where an $n \times n$ array of items is to be partitioned into $p^2$ blocks by
partitioning the rows and columns into $p$ intervals each and considering
the blocks induced by this partition. The problem is to find that partition
which minimizes the maximum weight among the resulting blocks.
This problem is known to be NP-hard [GM96]. Independently, Charikar
et al. have given a simple proof that shows that the problem is in fact
NP-hard to approximate within a factor of two. Here we provide a polynomial
time algorithm that determines a solution at most $O(1)$ times the
optimum; the previously best approximation ratio was $O(\sqrt{p})$ [HM96].
Both the results above are proved for the case when the weight of an
interval or block is the sum of the elements in it. These problems arise
in load balancing for parallel machines and data partitioning in parallel
languages. Applications in motion estimation by block matching in video
and image compression give rise to the dual problem, that of minimizing
the number of dividers $p$ so that the maximum weight of a block is at
most $\delta$. We give an $O(\log n)$ approximation algorithm for this problem.
All our results for two dimensional array partitioning extend to any
higher fixed dimension.


1  Introduction 

The problem of partitioning a set of items into roughly equal weight subsets is
a fundamental one. We study two dual versions of this, namely, (a) given $B$,
partition a given array into at most $B$ blocks so as to minimize the maximum
weight of any block in the partition, and (b) given $\delta$, partition a given
array into the minimum number of blocks such that their individual weight is no
larger than $\delta$. The definition of the weight function for a block, the type
of partitions allowed, the dimensionality of the arrays, and the relevant version
depend upon the application at hand. The problems we consider arise in load
balancing for parallel processing, compilers for high-performance parallel
languages, and motion estimation in videos by block matching, and hence have been
extensively researched in several communities. In this paper, we present algorithms
for these problems which are more efficient than the best ones so far, and give
improved approximations over those previously known. In what follows, we describe
the setting of the problems (Section 1.1), and describe various application
scenarios where three such problems arise (Section 1.2). We state our results for
such problems in Section 1.3 and present the technical details in Sections 2, 3
and 4.


1.1  Problems 

We begin with the one dimensional version. Consider an array $A[1 \cdots n]$ of
non-negative numbers, and a weight function $f$ that maps intervals of $A$ to
non-negative integers. The function $f$ is trivially assumed to be 0 on empty
intervals. The $p$-partition of $A$ is a division of $A$ into $p$ intervals, that
is, setting dividers $d_0 = 0 \le d_1 \le d_2 \le \cdots \le d_{p-1} \le d_p = n$.
Here the $i$th interval is $[d_{i-1} + 1 \cdots d_i]$ if $d_{i-1} \ne d_i$ and is
denoted empty otherwise. The MAX norm of a partition is
$\max_{i=1}^{p} f(A[d_{i-1} + 1 \cdots d_i])$. Two weight functions arise commonly
in practice: the additive weight function $F(A[i, j]) = \sum_{k=i}^{j} A[k]$, and
the Hamming weight function $H_c$ for a given parameter $c$, relative to another
array $B$ of size $n$, given by
$H_c(A[i, j]) = \min_{-c \le k \le c} \mathcal{H}(B[i + k, j + k], A[i, j])$, where
$\mathcal{H}(X, Y)$ gives the Hamming distance between two segments $X$ and $Y$ of
identical length.

The 1D $p$-partition problem. Given $p$, find the $p$-partition that minimizes
the  MAX  norm.  □ 
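The definitions above can be made concrete with a short sketch. The function names, the 1-based inclusive indexing, and the choice to skip shifts that would run off the end of $B$ are assumptions made for illustration; the text does not specify boundary handling for $H_c$.

```python
def additive_weight(A, i, j):
    """F(A[i..j]) for 1-based inclusive indices; an empty interval (i > j) weighs 0."""
    return sum(A[i - 1:j])


def hamming_weight(A, B, i, j, c):
    """H_c(A[i..j]): smallest Hamming distance between A[i..j] and B[i+k..j+k], |k| <= c.

    Assumption: shifts that would leave the array B are skipped.
    """
    n = len(A)
    segment = A[i - 1:j]
    best = None
    for k in range(-c, c + 1):
        if i + k < 1 or j + k > n:
            continue  # shifted window would fall outside B
        shifted = B[i - 1 + k:j + k]
        dist = sum(x != y for x, y in zip(segment, shifted))
        best = dist if best is None else min(best, dist)
    return best


def max_norm(A, dividers, weight=additive_weight):
    """MAX norm of the partition given by dividers d_0 = 0 <= d_1 <= ... <= d_p = n."""
    return max(weight(A, lo + 1, hi) for lo, hi in zip(dividers, dividers[1:]))
```

For instance, with `A = [2, 3, 1, 4, 2]` the dividers `[0, 2, 4, 5]` describe the 3-partition `[2, 3] [1, 4] [2]`, whose MAX norm under $F$ is 5.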

This notion can be naturally extended to a $p \times p$ partition in two
dimensions as follows. Consider an $n \times n$ array $A$. Divide the rows
$[1, n]$ into $p$ intervals given by horizontal dividers
$h_0 = 0 \le h_1 \le h_2 \le \cdots \le h_{p-1} \le h_p = n$, and the columns
$[1, n]$ into $p$ other intervals given by the vertical dividers
$v_0 = 0 \le v_1 \le v_2 \le \cdots \le v_{p-1} \le v_p = n$. This induces $p^2$
blocks given by $A[h_{i-1}+1 \ldots h_i, v_{j-1}+1 \ldots v_j]$ for each $i, j$.
The MAX norm of a partition is
$\max_{i,j} f(A[h_{i-1}+1 \ldots h_i, v_{j-1}+1 \ldots v_j])$. Again, the common
weight functions on blocks are $F$ and $H_c$, defined analogously as above for
intervals.

The 2D $p \times p$-partition problem. Given $p$, find the $p \times p$
partition that minimizes the MAX norm. □

The 2D $\delta$-weight partition problem. Given $\delta$, find the minimum $p$
for which there exists a $p \times p$ partition of the array with the MAX norm
of at most $\delta$. □

Remarks.  There  are  many  different  ways  to  partition  2D  arrays,  as  discussed 
in  [GM96,  KRW95,  MS96,  MM+96].  Here  we  consider  only  the  p  x  p  partition. 
These  problems  can  be  naturally  generalized  to  higher  dimensions.  Our  solutions 
for  the  2D  case  extend  to  higher  dimensions  in  a  straightforward  way.  However, 
the 1D and 2D cases are fundamentally different, and they will be contrasted
later. 
1.2  Application  Scenarios

Array  partitioning  problems  arise  in  load  balancing,  scheduling,  data  layout, 
video  compression,  etc.  We  focus  on  three  specific  array  partitioning  problems. 
Here  we  briefly  describe  the  application  context  for  each;  further  details  of  mod¬ 
eling  will  be  discussed  in  the  journal  version. 

One dimensional case under $F$. This problem was abstracted for load balancing
in pipelined, parallel environments in [B88] and studied in [OM95, AF91, HL92,
MS95, M93, CN91, HNC92, N91] etc.

Two dimensional case under $F$. This problem arises in balanced data
distribution as implemented in the Superb environment [ZBG86] and HPF2 [HPF]
(High Performance Fortran). See [M93, CM+95] for more applications to
particle-in-cell computations and sparse matrix computations.

Two dimensional case under $H_c$. This arises in motion-compensated video
compression by block matching. Roughly, this involves compressing a frame in a
video sequence by cutting it into rectangles, each of which is encoded in terms
of a block in the previous frame. See [MM+96] and the references therein for the
precise setting.

1.3  Results 

We  state  our  results  for  each  of  the  three  problems  of  our  interest. 

1D $p$-partition under $F$. This problem has been extensively researched. We
summarize  the  previous  work  and  our  results  in  the  table  below,  providing  all 
citations  where  identical  bounds  were  obtained  independently. 


Reference                                 Bound

Bokhari [B88]                             $O(n^3 p)$
Anily & Federgruen [AF91]                 $O(n^2 p)$
Hansen & Liu [HL92]                       $O(n^2 p)$
Manne & Sorevik [MS95]                    $O(np \log p)$
Choi & Narahari [CN91]                    $O(np)$
Olstad & Manne [OM95]                     $O(np)$
Nicol [N91]                               $O(n + p^2 \log^2 n)$
Charikar, Chekuri & Motwani [CCM96]       $O(n + p^2 \log^2 n)$
Han, Narahari & Choi [HNC92]              $O(n + p^{1+\epsilon})$, $\epsilon < 1$
This paper                                $O(n \log n)$

Our result relies on a binary search over a space of $O(n^2)$ items. However, at
each test, an approximate median among these items is identified in only $O(n)$
(as opposed to $O(n^2)$) time by exploiting the structure in our search space.
In particular, we design and use an algorithm that finds an approximate median
of the $O(n^2)$ elements, which are organized into $n$ sorted lists, in only
$O(n)$ time.

Throughout  we  have  made  no  assumptions  on  the  range  of  F’s.  However, 
improved  bounds  may  be  obtained  if  the  F’s  lie  in  a  restricted  range;  we  omit 
the  details  here. 
The 2D $\delta$-weight problem under $H_c$. A number of algorithms are known
for block matching, and in particular, for the 2D $\delta$-weight problem under
$H_c$. These essentially work by splitting subareas greedily until each subarea
has weight at most $\delta$ and do not provide any guarantees on the number of
blocks used. Building on the result of Grigni and Manne [GM96], this problem can
in fact be shown to be NP-hard.

Here we provide a polynomial time $O(\log n)$-approximation algorithm. We
obtain our result by a rather simple reduction to the classical set cover
problem. Our algorithm works for a general class of metrics including $F$ and
$H_c$.

The 2D $p \times p$ partition problem under $F$. Grigni and Manne [GM96] showed
that the 2D problem is NP-hard even when the given array consists of 0/1
entries. Independently, Charikar et al. have given a simple proof that this
problem is APX-hard, that is, the problem is in fact NP-hard to approximate
within a factor of two. While a number of natural heuristic algorithms are known
for this problem (see for example [MS96]), most of them can be shown to be bad
(typically  0{^))  approximations.  One  such  heuristic  has  been  recently  shown 
to have a performance guarantee of $O(\sqrt{p})$ by Halldorsson & Manne [HM96].
This  is  the  currently  best  known  approximation  for  this  problem. 


Reference                         Result

Grigni & Manne [GM96]             NP-Hardness
Charikar et al. [CCM96]           APX-Hardness
Halldorsson & Manne [HM96]        $O(\sqrt{p})$ approximation
This paper (Section 4)            $O(1)$ approximation

We observe that using our result for the 2D $\delta$-weight problem above, one
can easily obtain an $O(\log^2 n)$ approximation algorithm for the 2D
$p \times p$-partitioning problem under $F$. But our main contribution is an
$O(1)$-factor approximation for this problem which builds on an inherent
connection between "independent" rectangles of large weight within the array and
the cost of the optimal solution. Surprisingly, we are able to show that after a
suitable preprocessing of the input array, a locally optimal collection of
independent rectangles can be used to generate a solution which is at most a
constant factor away from the optimal.

2  The  One  Dimensional  Case  Under  F 

We assume for convenience that $F(A[i]) \ne 0$ for any $i$; this assumption can
be easily removed and we omit that detail. Define the Boolean function
$M_A(l, k, v)$ to be true if and only if there exists a partition of the
elements $A[l \ldots n]$ into $k$ intervals, such that the MAX norm of these
intervals is $\le v$. In our analysis below, we count only the complexity of
calls to the $F$ oracle; $F$ can be simulated in constant time after linear
preprocessing.

Lemma 1. $M_A(l, k, v)$ can be determined using $O(n)$ calls to the $F$ oracle
for arbitrary $k$, $l$, and $v$.

Proof. Note that without loss of generality the $(j+1)$st divider can be placed
as far to the right of the $j$th divider as possible such that the $F$ value of
the elements in that interval is $\le v$. By incrementally inserting dividers
from left to right so as to prevent the total in any interval from exceeding
$v$, we find the minimum number of dividers required in $O(n)$ time. If this
number exceeds $k$, then $M_A(l, k, v)$ = false. Otherwise, $M_A(l, k, v)$ =
true. □
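
The greedy scan in the proof of Lemma 1 can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name is hypothetical, indices are 1-based as in the text, and running sums stand in for calls to the $F$ oracle.

```python
def feasible(A, l, k, v):
    """M_A(l, k, v): can A[l..n] (1-based l) be cut into at most k intervals of sum <= v?

    Greedy left-to-right scan: extend the current interval as far as possible,
    and open a new one only when adding the next element would exceed v.
    """
    intervals_used = 1
    running = 0
    for x in A[l - 1:]:
        if x > v:
            return False  # a single element already violates the bound
        if running + x > v:
            intervals_used += 1  # place a divider just before x
            running = x
        else:
            running += x
    return intervals_used <= k
```

With `A = [2, 3, 1, 4, 2]`, three intervals suffice for the bound 5 (`[2, 3] [1, 4] [2]`) but not for the bound 4.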

In the optimal partitioning with $k$ dividers, there will be an interval
$[i, j]$ which will prove the bottleneck of the partitioning: an interval is a
bottleneck to the partitioning if it is the largest weight interval that results
from this partitioning. There are $\binom{n}{2}$ candidates for this bottleneck
interval. Performing a binary search on these candidates, using the linear-time
oracle of Lemma 1, would yield an $O(n \log n)$ algorithm to search for the
$p$-partition. However, this requires a method to efficiently compute the
sequence of (approximate) median candidates to support the binary search.
Conventional linear-time median-finding is clearly inadequate, since we have
only $O(n)$ time to find the median of $O(n^2)$ elements.

We take advantage of the fact that this collection of $O(n^2)$ elements is not
arbitrary, but has rather been derived from interval sums over $n$ elements. We
partition the $\binom{n}{2}$ interval values $F(A[i,j])$ into $n$ columns, where
column $c$ consists of the elements $F(A[i,c])$, $1 \le i \le c$. Let
$C_c[i] = F(A[i,c])$ denote the $i$th element of column $c$. The subcolumn
$S_c[i,j]$ comprises elements $C_c[x]$, $i \le x \le j$. These definitions are
illustrated in Figure 1(a).

Lemma 2. $C_c[i] \ge C_c[j]$ for $i < j$. Further, the median element of any
subcolumn $S_c[i,j]$ can be determined with one call to the $F$ oracle.

Proof. The first claim follows since the elements of each column are
monotonically non-increasing. The second claim follows since the median of
$S_c[i,j]$ is $F(A[\lfloor (i+j)/2 \rfloor, c])$. □
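
As an illustration of the constant-time $F$ oracle and of Lemma 2, the sketch below precomputes prefix sums and reads off column entries and subcolumn medians. The class and method names are hypothetical; only the formulas come from the text.

```python
from itertools import accumulate


class IntervalSums:
    """Constant-time F oracle over a 1-based array, after linear preprocessing."""

    def __init__(self, A):
        # prefix[t] = A[1] + ... + A[t], with prefix[0] = 0
        self.prefix = [0] + list(accumulate(A))

    def F(self, i, j):
        """Additive weight of A[i..j], 1-based inclusive."""
        return self.prefix[j] - self.prefix[i - 1]

    def column_entry(self, c, i):
        """C_c[i] = F(A[i, c]); non-increasing in i because entries are non-negative."""
        return self.F(i, c)

    def subcolumn_median(self, c, i, j):
        """Median of S_c[i, j], obtained with a single oracle call (Lemma 2)."""
        return self.F((i + j) // 2, c)
```

For `A = [2, 3, 1, 4, 2]`, column 4 reads `10, 8, 5, 4` from top to bottom, and the median reported for the whole column is the entry at index `(1 + 4) // 2 = 2`, namely 8.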

Theorem 3. The 1D $p$-partition problem under $F$ can be solved in
$O(n \log n)$ time.

Proof. As per the above discussion, we effectively perform a binary search over
the set of $\binom{n}{2}$ interval values. For each column, we will maintain one
subcolumn containing the range of intervals which might include the optimum. Let
$U$ be the set of elements representing the union of the elements in all the
active subcolumns. A splitter for $U$ is an element $m$ such that the rank of
$m$ in $U$, say $r_m$, satisfies

$|U|/c \le r_m \le |U|(c-1)/c$

for some constant $c$. The following algorithm finds a splitter for the active
subcolumns in $O(n)$ time:

1. We find $m$, the median element of the set of $\le n$ median elements of the
active subcolumns. Using Lemma 2, this set of median elements can be identified
in $O(n)$ time. The median of this collection, $m$, can now be identified in
$O(n)$ time using the standard linear-time median finding algorithm.




Fig.  1.  (a)  Columns  and  subcolumns  of  A.  (b)  The  median  of  medians  is  not  necessarily 
a  splitter. 


2. We divide the active subcolumns into two sets according to whether their
median is $\le m$ or not. Let $C_l$ ($C_r$) denote the set of elements in
subcolumns whose medians are $\le m$ ($> m$). If
$\min(|C_l|, |C_r|) \ge (|C_l| + |C_r|)/3$, we return $m$ as a good splitter.

3.  As  illustrated  by  Figure  1(b),  this  median  of  medians  is  not  necessarily  a 
good  splitter.  If  not,  we  recur  on  the  appropriate  set  of  subcolumns  (the  ones 
containing  the  larger  number  of  elements)  for  the  splitter  search.  Because 
the  set  of  subcolumns  under  consideration  is  halved  on  each  iteration,  the 
total  search  time  remains  linear. 

If $M_A(1, p, m)$ = true, then $m$ is an upper bound on the optimal MAX norm.
Every subcolumn in $C_r$ has a median exceeding $m$, and since columns are
non-increasing, half of the elements in each such subcolumn may be eliminated
by replacing subcolumn $S_c[i,j]$ with $S_c[\lfloor (i+j)/2 \rfloor, j]$. If
$M_A(1, p, m)$ = false, then $m$ is a lower bound on the optimal MAX norm. Half
of the elements in each subcolumn in $C_l$ may be eliminated, by replacing
subcolumn $S_c[i,j]$ with $S_c[i, \lfloor (i+j)/2 \rfloor]$. In either case, a
constant fraction of the elements are eliminated in each linear-time round, and
hence the optimal partition is identified in $O(n \log n)$ time. □
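
For comparison, the search that Theorem 3 accelerates can be written naively: materialize all interval sums, sort them, and binary search with the greedy test of Lemma 1. The sketch below is this baseline only, not the splitter-based algorithm of the proof; it spends $O(n^2 \log n)$ time on the sort where the proof never materializes the candidates. All names are illustrative.

```python
def min_max_partition(A, p):
    """Optimal MAX norm under F for a p-partition of A (naive baseline, n >= 1, p >= 1)."""
    n = len(A)
    prefix = [0]
    for x in A:
        prefix.append(prefix[-1] + x)

    # Every candidate bottleneck value is the sum of some non-empty interval.
    candidates = sorted({prefix[j] - prefix[i]
                         for i in range(n) for j in range(i + 1, n + 1)})

    def feasible(v):
        """Greedy test of Lemma 1: at most p intervals, each of sum <= v."""
        used, running = 1, 0
        for x in A:
            if x > v:
                return False
            if running + x > v:
                used, running = used + 1, x
            else:
                running += x
        return used <= p

    # Feasibility is monotone in v, so binary search finds the smallest feasible candidate.
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if feasible(candidates[mid]):
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]
```

The design point of the proof is that the sorted list `candidates` is never built: each round only needs one approximate median of the surviving subcolumns.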

3  2D $\delta$-weight partition under $H_c$

We begin by considering the following geometric problem. We say that a
rectangle is stabbed by a line if the line passes through the interior of the
rectangle.

Stabbing Problem. Given a set of axis-parallel rectangles in the
$[1, n] \times [1, n]$ two dimensional integer grid, determine a set $R$ of grid
rows and $C$ of grid columns such that each rectangle is stabbed by one of the
rows in $R$ or one of the columns in $C$ and furthermore, $s = \max\{|R|, |C|\}$
is minimized. □

Lemma 4. The stabbing problem is $O(\log n)$-approximable.


Proof.  The  proof  is  by  reduction  to  set  cover;  the  details  are  deferred  to  the  final 
version.  □ 
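
Since the reduction is deferred, the following is only one plausible reading of it: treat each candidate line as the set of rectangles it stabs and run the standard greedy set cover heuristic. Two assumptions are made here. A rectangle is given as inclusive cell ranges `(r1, r2, c1, c2)`, and a cut placed after row `h` stabs it when `r1 <= h < r2` (likewise for columns). The greedy rule minimizes the total number of lines, which is within a factor of two of $\max\{|R|, |C|\}$.

```python
def greedy_stabbing(rects):
    """Greedy set cover: repeatedly pick the line stabbing the most unstabbed rectangles.

    rects: list of (r1, r2, c1, c2) with inclusive cell ranges.
    Returns (rows, cols), the chosen cut positions.
    """
    stabs = {}  # line -> set of indices of rectangles it stabs
    for idx, (r1, r2, c1, c2) in enumerate(rects):
        for h in range(r1, r2):
            stabs.setdefault(("row", h), set()).add(idx)
        for v in range(c1, c2):
            stabs.setdefault(("col", v), set()).add(idx)

    uncovered = set(range(len(rects)))
    chosen = []
    while uncovered:
        best = max(stabs, key=lambda ln: len(stabs[ln] & uncovered), default=None)
        if best is None or not (stabs[best] & uncovered):
            # A single-cell rectangle has no interior cut and cannot be stabbed.
            raise ValueError("some rectangle cannot be stabbed by any line")
        chosen.append(best)
        uncovered -= stabs[best]

    rows = sorted(pos for kind, pos in chosen if kind == "row")
    cols = sorted(pos for kind, pos in chosen if kind == "col")
    return rows, cols
```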

Theorem 5. There exists a polynomial time $O(\log n)$ factor approximation for
the 2D $\delta$-weight partition problem under $H_c$.

Proof. We reduce this problem to the stabbing problem above. Consider the
collection of all possibly overlapping minimal rectangles where the $F$ value of
each rectangle is $> \delta$; rectangles are minimal in the sense that if two
rectangles have $F$ value $> \delta$ and one is contained in the other, we
retain the smaller one. Now the 2D $\delta$-weight partition problem is
precisely the stabbing problem, for which an $O(\log n)$ factor approximation
exists. □

4  2D $p \times p$-partition under $F$

Grigni  and  Manne  [GM96]  have  shown  that  the  2D  p x  p-partition  problem  under 
F  is  NP-Complete.  Charikar  et  al  [CCM96]  proved  that  it  is  NP-Complete  even 
to  approximate  the  solution  within  a  factor  of  2.  In  this  section,  we  present  a 
polynomial  time  heuristic  which  provides  an  0(1)  factor  approximation. 

The  following  lemma  is  crucially  used  in  our  arguments. 

Lemma 6. Let $c$ and $d$ be two positive integers, $c, d \le k$. If there exists
a $k \times k$ partitioning such that the MAX norm of the blocks is $B$ under
$F$, then there exists a $k/c \times k/d$ partitioning with MAX norm $\le cdB$
under $F$.

Proof. Consider a $k \times k$ partitioning with MAX norm $B$ and take every
$c$th row as well as every $d$th column. The maximum $F$ value of a block of
this $k/c \times k/d$ partitioning is at most $cdB$ since each new block
contains $cd$ of the previous blocks. □
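
Lemma 6 is easy to check numerically. The sketch below thins out a divider list and evaluates the MAX norm under $F$ with 2D prefix sums; the helper names are assumptions, and divider lists are taken to include both endpoints 0 and $n$.

```python
def max_norm_2d(A, hdiv, vdiv):
    """MAX norm under F of the partition with row dividers hdiv and column dividers vdiv."""
    rows, cols = len(A), len(A[0])
    # P[i][j] = sum of A over the first i rows and first j columns
    P = [[0] * (cols + 1) for _ in range(rows + 1)]
    for i in range(rows):
        for j in range(cols):
            P[i + 1][j + 1] = A[i][j] + P[i][j + 1] + P[i + 1][j] - P[i][j]

    def block(r0, r1, c0, c1):
        """Weight of rows r0+1..r1 and columns c0+1..c1."""
        return P[r1][c1] - P[r0][c1] - P[r1][c0] + P[r0][c0]

    return max(block(r0, r1, c0, c1)
               for r0, r1 in zip(hdiv, hdiv[1:])
               for c0, c1 in zip(vdiv, vdiv[1:]))


def coarsen(dividers, step):
    """Keep every step-th divider, always retaining the first (0) and the last (n)."""
    kept = dividers[::step]
    if kept[-1] != dividers[-1]:
        kept.append(dividers[-1])
    return kept
```

Each interval of the coarsened list merges at most `step` consecutive intervals of the original, which is exactly why the block weight grows by at most the factor $cd$.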

This lemma can be combined with the observation that Theorem 5 holds for the
2D $\delta$-weight partition problem under $F$ as well, to get the following.

Theorem 7. There exists a polynomial time $O(\log^2 n)$-approximation for the
2D $p \times p$-partition problem under $F$.

We  omit  the  proof  in  this  extended  abstract. 

The  main  result  in  this  section  is  a  substantially  improved  approximation 
algorithm;  our  algorithm  computes  an  0(1)  factor  approximation. 

Let a $(W, \ell)$-partition be an $\ell \times \ell$-partition such that the
MAX norm of the blocks is at most $W$. We will now show that given an input
instance for which a $(W, \ell)$-partition exists, we can construct in
polynomial time a $(O(W), \ell)$-partition. The basic idea behind our algorithm
is the notion of independent rectangles:

Definition 8. Two axis-parallel rectangles are said to be independent if their
projections are disjoint along both the $x$-axis and the $y$-axis.

Clearly, no single horizontal or vertical line can stab a pair of independent
rectangles. So if an array has a $(W, \ell)$-partition, then it may contain at
most $2\ell$ independent rectangles of weight strictly greater than $W$. As a
result, independent rectangles constitute a useful tool in establishing a lower
bound on the optimal solution value. The algorithm presented below builds on
this idea to construct a partition whose cost is $O(W)$.


4.1  The  Algorithm 

Let $W$ be the optimal solution value. We assume a knowledge of this value in
the presentation below; this value will be determined by performing a binary
search over the interval $[0, \sum_{i,j} A[i,j]]$. Observe that
$W \ge \max_{i,j} A[i,j]$. Our algorithm consists of the following five steps:

Step 1. We obtain an $\ell \times \ell$ partition of the array such that each
row or column within any block in the partition has weight at most $2W$. □

This can be done by performing independent horizontal and vertical scans.
During the horizontal scan, we keep a running sum of the weight of each row
since the most recent vertical partition and set down the next vertical
partition when the weight of any one of the rows exceeds $W$. Likewise, we set
horizontal partitions based on running sums of the weights of columns during the
vertical scan. Since each time a new column (row, respectively) is considered,
the weight of the rows (columns, respectively) can increase by at most $W$, it
follows that the weight of any row (column, respectively) within any block
induced by the vertical and horizontal partitions does not exceed $2W$.
Henceforth we consider the array with this $\ell \times \ell$ partition, which
we refer to as the partition $P$.
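
The two scans of Step 1 can be sketched as below. This is an illustration under stated assumptions: every entry is at most `W`, a cut is recorded as the number of columns (or rows) to its left (above it), and the check that no more than $\ell$ cuts are produced is left to the caller. The function names are hypothetical.

```python
def vertical_cuts(A, W):
    """Scan columns left to right; cut right after the column that pushes a row sum above W."""
    rows, cols = len(A), len(A[0])
    running = [0] * rows  # weight of each row since the most recent cut
    cuts = []
    for j in range(cols):
        for i in range(rows):
            running[i] += A[i][j]
        if max(running) > W and j + 1 < cols:
            cuts.append(j + 1)  # divider after the (j+1)-th column, 1-based
            running = [0] * rows
    return cuts


def step1_partition(A, W):
    """Return (horizontal_cuts, vertical_cuts) for the partition P of Step 1."""
    transposed = [list(col) for col in zip(*A)]
    return vertical_cuts(transposed, W), vertical_cuts(A, W)
```

Before a cut is triggered every running row sum is at most `W`, and one more column adds at most `W`, so each row segment between consecutive cuts weighs at most `2W`, as argued in the text.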

Step 2. We construct the set $S$ of all minimal rectangles whose weight exceeds
W  and  which  are  entirely  contained  within  the  blocks  induced  by  the  partition 
from  Step  1.  A  rectangle  is  minimal  if  there  does  not  exist  another  rectangle 
properly  contained  in  it  with  weight  larger  than  W.  □ 

This  can  be  done  by  starting  from  each  location  within  a  block  and  consid¬ 
ering  rectangles  with  their  top  left  corner  at  that  location  in  turn  in  the  order 
of  increasing  sides  until  all  minimal  rectangles  of  weight  strictly  greater  than  W 
are  discovered. 

Step 3. We determine a local 3-optimal set $M \subseteq S$ of independent
rectangles. $M$ is a local 3-optimal set if there does not exist
$i \in \{1, 2, 3\}$ independent rectangles in $S \setminus M$ which can be added
to $M$ by removing at most $(i - 1)$ rectangles from $M$ without violating the
independence condition. □

Such a set can be easily constructed in polynomial time by repeatedly
performing swaps which increase the size of the current independent collection.
Each swap takes polynomial time and the procedure terminates in polynomial time
since any independent collection can have at most $O(n)$ rectangles.
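
A direct, unoptimized rendering of this local search is sketched below. The representation of a rectangle as inclusive ranges `(r1, r2, c1, c2)`, the function names, and the brute-force enumeration of candidate groups are assumptions made for illustration; the rectangles in `S` are assumed distinct.

```python
from itertools import combinations


def independent(a, b):
    """True iff the row projections are disjoint and the column projections are disjoint."""
    rows_disjoint = a[1] < b[0] or b[1] < a[0]
    cols_disjoint = a[3] < b[2] or b[3] < a[2]
    return rows_disjoint and cols_disjoint


def local_3_optimal(S):
    """Grow an independent set M by swaps that add i <= 3 rectangles and remove at most i - 1."""
    M = []
    improved = True
    while improved:
        improved = False
        outside = [r for r in S if r not in M]
        for i in (1, 2, 3):
            for group in combinations(outside, i):
                if not all(independent(a, b) for a, b in combinations(group, 2)):
                    continue
                # Members of M that clash with the group would have to be removed.
                conflicts = [m for m in M if any(not independent(m, g) for g in group)]
                if len(conflicts) <= i - 1:
                    M = [m for m in M if m not in conflicts] + list(group)
                    improved = True
                    break
            if improved:
                break
    return M
```

Every accepted swap increases the size of `M` by at least one, so the loop performs at most `len(S)` swaps.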

Step 4. We now introduce another partition based on $M$. For each rectangle in
$M$, we set two straddling horizontal and two straddling vertical partitions so
as to induce that rectangle. In all, this introduces at most $2|M|$ horizontal
and $2|M|$ vertical partitions. The partition $P$ from Step 1 together with this
partition induced by rectangles in $M$ is our new partition now. □

Step 5. We now have a partition of the input array which uses
$h \le 2|M| + \ell$ horizontal lines and $v \le 2|M| + \ell$ vertical lines. To
get an $\ell \times \ell$ partition from this, we simply retain only every
$\lceil h/\ell \rceil$th horizontal line and only every $\lceil v/\ell \rceil$th
vertical line. By Lemma 6, this increases the maximum block weight by at most a
factor of $\lceil h/\ell \rceil \lceil v/\ell \rceil$.

4.2  Analysis:  Approximation  Guarantee  and  Correctness 

We need to establish two properties of the above algorithm: (a) given a choice
$W$ for which the input array has a $(W, \ell)$-partition, the weight of any
block in the partition constructed by the above algorithm is $O(W)$, and (b) the
smallest value $W$ for which the analysis of the algorithm holds, identified via
binary search, is upper bounded by the optimum solution value. We begin by
establishing the first property above; the following lemma is central to the
analysis here.

Lemma 9. Let $b$ be a block contained in some block of the partition $P$
constructed in Step 1 above. Then if the weight of block $b$ is at least $27W$,
it can be partitioned into 3 independent rectangles, each with weight strictly
exceeding $W$.

Proof. Given a block of weight at least $27W$, we construct three independent
rectangles of weight exceeding $W$ as follows. First we perform a vertical scan,
placing a horizontal cut as soon as the weight of the slab seen thus far exceeds
$7W$; we place two horizontal cuts in all. This gives us three slabs each of
weight strictly greater than $7W$. Now we perform a horizontal scan from right
to left, placing the first vertical cut as soon as one of the horizontal slabs
exceeds weight $W$. Without loss of generality assume that it is the top slab.
Then the top right block has weight greater than $W$ but does not exceed $3W$,
and the two lower horizontal slabs to the left of that vertical cut have weight
greater than $4W$ each. Now in a similar manner we place a second vertical cut
to obtain two independent blocks of weight exceeding $W$ from these two
horizontal slabs. Thus we get three independent rectangles of weight greater
than $W$ each. □

Lemma 10. The weight of any block in the partition constructed at the end of
Step 4 is $O(W)$.

Proof. We begin by observing the following easily verifiable properties of the
solution: (a) each block of the solution is completely contained in some block
of the partition $P$, and (b) given a block $b \in M$ and another block
$b' \notin M$, their projections on the $x$-axis or the $y$-axis are either
completely disjoint or have a perfect overlap.

Now consider a block $b$ in the solution; using the preceding observations, it
is readily seen to fall into one of the following categories: (1) the block $b$
belongs to $M$, or (2) the block $b$ does not belong to $M$ but has a perfect
overlap along one of the axes with a block $b' \in M$, or (3) the block $b$ does
not belong to $M$ but has a perfect overlap along the $x$-axis with a block
$b' \in M$ and a perfect overlap along the $y$-axis with a block $b'' \in M$.

In Case 1, the weight of $b$ is $O(W)$ since the set $S$ as defined in Step 2
has the property that any rectangle $r$ in it has weight at most $3W$. This is
because otherwise, we can always remove either a row or a column (of weight at
most $2W$) from $r$ to obtain a rectangle $r'$ of weight greater than $W$,
contained in $r$, which violates the minimality of the rectangles in $S$.

In Cases 2 and 3, each block has weight at most $27W$; this follows from an
application of Lemma 9. We observe that at most two blocks in $M$, say $b'$ and
$b''$, may not be independent of a block $b$ which falls into these two cases.
So if $b$ has weight greater than $27W$, we can replace $b'$ and $b''$ with at
least three independent rectangles which are constructible from $b$ (and are
contained in $S$). But this contradicts the local 3-optimality of the collection
$M$ constructed in Step 3. Hence $b$ must have weight at most $27W$. □

Lemma 11. The number of rectangles in $M$ is at most $2\ell$ for any choice $W$
for which there exists a $(W, \ell)$-partition of the input array.

Proof. If $M$ had $x$ rectangles, then each of those rectangles must be stabbed
in the optimal solution since the optimal solution value is bounded by $W$ and
every rectangle in $M$ has weight strictly greater than $W$. Stabbing $x$
rectangles requires at least $x/2$ horizontal or vertical partitions and hence
$x$ must be at most $2\ell$. □

Lemma 12. The weight of any block in the final solution returned in Step 5 is
at most $O(W)$ for any choice $W$ for which there exists a $(W, \ell)$-partition
of the input array.

Proof. Lemma 11 tells us that the number of horizontal and vertical partitions
at the end of Step 4 is $O(\ell)$ each. This fact, along with an application of
Lemma 6, allows us to conclude that the weight of every resulting block in the
$\ell \times \ell$ partition is $O(W)$. □

This completes the proof of the first property of our algorithm, namely that it
gives a solution of weight $O(W)$ whenever a $(W, \ell)$-partition exists. To
conclude, we observe that the least value $W$ for which the algorithm either
fails to construct the partition $P$ in Step 1 or yields a collection $M$ in
Step 3 with more than $2\ell$ rectangles, must exceed the optimum. Thus the
binary search procedure works to identify a suitable $W$.

Theorem 13. There exists a polynomial time algorithm that computes an
$O(1)$-factor approximation to the two dimensional block partitioning problem.

Abstract. Let $\mathcal{G}_k$ be the class of graphs with branchwidth at most
$k$. In this paper we prove that one can construct, for any $k$, a linear time
algorithm that checks if a graph belongs to $\mathcal{G}_k$ and, if so, outputs
a branch decomposition of minimum width. Moreover, we find the obstruction set
for $\mathcal{G}_3$ and, for the same class, we give a safe and complete set of
reduction rules. Our results lead to a practical linear time algorithm that
checks if a graph has branchwidth $\le 3$ and, if so, outputs a branch
decomposition of minimum width.


1  Introduction 

This  paper  considers  the  problem  to  find  branch  decompositions  of  graphs  with 
small  branchwidth.  The  notion  of  branchwidth  has  a  close  relationship  to  the 
more  well-known  notion  of  treewidth,  a  notion  that  has  come  to  play  a  large 
role  in  many  recent  investigations  in  algorithmic  graph  theory.  (See  Section  2 
for  definitions  of  treewidth  and  branchwidth.)  One  reason  for  the  interest  in  this 
notion  is  that  many  graph  problems  can  be  solved  by  linear  time  algorithms, 
when  the  inputs  are  restricted  to  graphs  with  some  uniform  upper  bound  on 
their  treewidth.  Most  of  these  algorithms  first  try  to  find  a  tree  decomposition 
of  small  width,  and  then  utilise  the  advantages  of  the  tree  structure  of  the 
decomposition. 

The branchwidth of a graph differs from its treewidth by at most a
multiplicative constant factor (see Theorem 1). As branchwidth also reflects
some optimal tree structure arrangement, it is possible to have algorithmic
applications analogous to those of treewidth. Hence, instead of using tree
decompositions, one can also use branch decompositions as the starting point for
the linear time algorithms for problems restricted to graphs with bounded
treewidth (and hence also bounded branchwidth). In fact, in some cases, it
appears that branchwidth is more convenient to use, and seems to give better
constant factors in the implementation of the algorithms; for instance, Cook
used branch decompositions as an important ingredient in a practical
approximation algorithm for the Travelling Salesman Problem [11], and remarked
that branchwidth was the more natural notion (instead of treewidth) to use for
that problem [10]: where tree decompositions primarily are concerned with
vertices, branch decompositions deal more with edges (in a loose sense). We also
mention that the branchwidth of planar graphs can be computed in polynomial time
(see [20]). As both treewidth and branchwidth are NP-complete parameters (see
[2, 20]), it appears an interesting task to find algorithms solving the
following problems ($k$ is assumed to be a fixed constant).

* The second author was supported by the Training and Mobility of Researchers
(TMR) Program (EU contract no ERBFMBICT950198).

$\Pi_k^1(B)$ ($\Pi_k^1(T)$): Check if some input graph has branchwidth
(treewidth) $\le k$.

$\Pi_k^2(B)$ ($\Pi_k^2(T)$): Given a graph with branchwidth (treewidth) at most
$k$, output a minimum width branch (tree) decomposition.

According to the results of Robertson and Seymour, for any minor closed class
of graphs there exists a finite set of graphs, its obstruction set, such that a
graph $G$ belongs to the class iff no element of the obstruction set is a minor
of $G$. It is also known that, for any $k$, the class of graphs whose treewidth
(or branchwidth) is bounded by a fixed $k$ is minor closed (see also Theorem 1).
An immediate consequence of this fact (using results from Robertson and Seymour
and the algorithm from [6]) is the existence of a linear time algorithm solving
$\Pi_k^1(B)$ or $\Pi_k^1(T)$. Unfortunately, in this way, we only get a
non-constructive proof of the existence of such an algorithm; in order to
construct the algorithm, we must know the corresponding obstruction set.
Additionally, we would like to have an algorithm that not only decides on
branchwidth, but also constructs corresponding branch decompositions.

Much  research  has  been  done  towards  the  construction  of  linear  time  algorithms 
solving  ni{T)  and  77^  (T).  In  [6],  a  linear  (on  the  size  of  the  input)  time  algo¬ 
rithm  for  treewidth  was  constructed.  As  this  algorithm  appears  to  be  heavily 
exponential  on  k  (and  thus  impractical,  at  least  without  considerably  optimisa¬ 
tions  in  the  implementation) ,  more  practical  algorithms  have  been  presented  for 
small  values  of  k:  (treewidth  1  and  2  [14,  22],  treewidth  3  [4,  12,  14],  treewidth 
4  [18].)  Also,  the  obstruction  sets  for  treewidth  1,2,  and  3  are  known  [5,  19,  22]. 
In  this  paper,  we  find  analogous  results  to  those  of  [4,  5,  6,  19,  12,  14,  19]  for 
the  parameter  of  branchwidth.  Namely,  for  any  fixed  k,  one  can  construct: 

• A linear time algorithm that solves both the decision and the construction
problem for branchwidth at most k.

• A parallel algorithm that solves the decision problem in $O(\log n \log^* n)$
time on an EREW PRAM or $O(\log n)$ time on a CRCW PRAM and needs $O(n)$
operations.

• A sentence in monadic second order logic expressing whether a graph has
branchwidth at most k or not.

• The obstruction set of the graphs of branchwidth at most k.

As (similarly to the case of treewidth) the algorithms above appear to be
non-practical, we provide special results for the case where $k \le 3$. More
specifically, for the class of graphs with branchwidth $\le 3$, we identify the
obstruction set and we give a set of safe and complete reduction rules enabling
the construction of a practical linear time algorithm that checks if a graph has
branchwidth $\le 3$ and, if so, outputs a minimum width branch decomposition.
The paper is organised as follows. In Section 2 the basic definitions and
preliminary results are presented. In Section 3 we give several graph theoretic
results on $\mathcal{G}_3$. These results concern the obstruction set of
$\mathcal{G}_3$ and the identification of a complete and safe set of reduction
rules for $\mathcal{G}_3$, leading to the construction of a practical linear time
algorithm solving the decision and the construction problem for branchwidth 3.
In Section 4 we present a general (for any fixed value of k) solution for both
problems.


2  Definitions  and  Preliminary  Results 

We consider undirected graphs without parallel edges or self-loops. (It is easy
to extend the results to graphs with parallel edges and/or self-loops.) Given a
graph G = (V, E) we denote its vertex set V and edge set E with V(G) and E(G)
respectively. For any vertex $v \in V(G)$, we define as $N_G(v)$ the set of
vertices in V(G) adjacent with v. Also, given a set $S \subseteq V(G)$ we denote
as G[S] the graph induced by S. We also denote as $K_r$ the complete graph with
r vertices.

Given two graphs G, H we say that H is a minor of G (denoted by $H \preceq G$) if
H can be obtained from G by a series of vertex/edge deletions and/or edge
contractions (a contraction of an edge {u, v} in G is the operation that replaces
u and v by a new vertex whose neighbours are the vertices that were adjacent to
u and/or v). Let $\mathcal{G}$ be a class of graphs. We say that $\mathcal{G}$ is
closed under taking of minors when all minors of any graph in $\mathcal{G}$
belong also to $\mathcal{G}$. Robertson and Seymour proved (see e.g. [16]) that
any class of graphs $\mathcal{G}$ contains a finite set of minor minimal
elements. We call such a set the obstruction set of $\mathcal{G}$. It follows
that if $\mathcal{G}$ is closed under taking of minors, then, for any graph G,
$G \in \mathcal{G}$ iff there is no graph H in the obstruction set of
$\mathcal{G}$ such that $H \preceq G$.
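The contraction operation described above can be sketched in a few lines. This is only an illustrative sketch, not code from the paper: the graph is assumed to be stored as a dictionary mapping each vertex to its set of neighbours, and the merged vertex reuses the name u instead of receiving a fresh name. The function name `contract_edge` is hypothetical.

```python
def contract_edge(adj, u, v):
    """Contract edge {u, v}; the merged vertex keeps the name u (a naming assumption)."""
    if v not in adj.get(u, set()):
        raise ValueError("u and v must be adjacent")
    # Neighbours of the merged vertex: everything adjacent to u and/or v.
    merged = (adj[u] | adj[v]) - {u, v}
    result = {}
    for x, nbrs in adj.items():
        if x == v:
            continue
        if x == u:
            result[x] = set(merged)
        else:
            # Redirect any edge that pointed at v so that it points at u.
            result[x] = {u if y == v else y for y in nbrs}
    return result


# Contracting one edge of a 4-cycle yields a triangle.
cycle4 = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
triangle = contract_edge(cycle4, 1, 2)
```

Because neighbour sets are used, parallel edges created by a contraction collapse automatically, which matches the simple-graph setting of this section.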

A tree decomposition of a graph G is a pair $(\{X_i \mid i \in I\}, T = (I, F))$,
where $\{X_i \mid i \in I\}$ is a collection of subsets of V and T is a tree,
such that

• $\bigcup_{i \in I} X_i = V(G)$,
• for each edge $\{v, w\} \in E(G)$, there is an $i \in I$ such that
$v, w \in X_i$, and
• for each $v \in V$ the set of nodes $\{i \mid v \in X_i\}$ forms a subtree of T.

The width of a tree decomposition $(\{X_i \mid i \in I\}, T = (I, F))$ equals
$\max_{i \in I} \{|X_i| - 1\}$. The treewidth of a graph G is the minimum width
over all tree decompositions of G.
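The three conditions of a tree decomposition and its width can be checked mechanically. The sketch below is an illustration only: the names `is_tree_decomposition` and `decomposition_width`, and the representation (bags as a dictionary from tree node to vertex set, the tree as a list of node pairs that is assumed to really form a tree) are assumptions made for this example.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Check the three defining conditions; tree_edges is assumed to form a tree."""
    # Condition 1: the bags cover the vertex set.
    if set().union(*bags.values()) != set(vertices):
        return False
    # Condition 2: every graph edge is contained in some bag.
    for v, w in edges:
        if not any(v in bag and w in bag for bag in bags.values()):
            return False
    # Condition 3: the tree nodes whose bags contain v induce a connected subtree.
    tree_adj = {i: set() for i in bags}
    for i, j in tree_edges:
        tree_adj[i].add(j)
        tree_adj[j].add(i)
    for v in vertices:
        nodes = {i for i, bag in bags.items() if v in bag}
        start = next(iter(nodes))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in tree_adj[i]:
                if j in nodes and j not in seen:
                    seen.add(j)
                    stack.append(j)
        if seen != nodes:
            return False
    return True


def decomposition_width(bags):
    """Width = largest bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1


# A 4-cycle with a two-bag decomposition of width 2.
c4_vertices = [1, 2, 3, 4]
c4_edges = [(1, 2), (2, 3), (3, 4), (4, 1)]
c4_bags = {0: {1, 2, 4}, 1: {2, 3, 4}}
c4_tree = [(0, 1)]
```

The check is quadratic in the worst case, which is enough for experimenting with small examples; it does not compute the treewidth itself, only the width of a given decomposition.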

A branch decomposition of a graph G is a pair $(T, \tau)$, where T is a tree with
vertices of degree 1 or 3 and $\tau$ is a bijection from the set of leaves of T to
E(G). The order of an edge e in T is the number of vertices $v \in V(G)$ such
that there are leaves $t_1, t_2$ in T in different components of
$T(V(T), E(T) - e)$ with $\tau(t_1)$ and $\tau(t_2)$ both incident with v (we also
say: v belongs to e). The width of $(T, \tau)$ is the maximum order over all edges
of T, and the branchwidth of G is the minimum width over all branch
decompositions of G (in case where $|E(G)| \le 1$, then we define the branchwidth
to be 0; if $|E(G)| = 0$, then G has no branch decomposition; if $|E(G)| = 1$,
then G has a branch decomposition consisting of a tree with one vertex; the width
of this branch decomposition is considered to be 0).
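To make the notion of order and width concrete, the following sketch computes the width of a given branch decomposition directly from the definition. It is an assumption-laden illustration: the function name, the representation of T as a list of node pairs, and the mapping from leaves to graph edges as a dictionary (called `tau` here) are all chosen for this example, and the degree condition on T is not verified.

```python
def branch_decomposition_width(tree_edges, tau):
    """Maximum order over all edges of T; tau maps leaves of T to graph edges (pairs)."""
    adj = {}
    for a, b in tree_edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def graph_vertices_on_side(start, a, b):
        # Collect endpoints of graph edges mapped to leaves reachable from
        # `start` without crossing the tree edge {a, b}.
        seen, stack, verts = {start}, [start], set()
        while stack:
            x = stack.pop()
            if x in tau:
                verts.update(tau[x])
            for y in adj[x]:
                if {x, y} == {a, b} or y in seen:
                    continue
                seen.add(y)
                stack.append(y)
        return verts

    width = 0
    for a, b in tree_edges:
        # Order of {a, b}: vertices incident with edges on both sides.
        order = len(graph_vertices_on_side(a, a, b) & graph_vertices_on_side(b, a, b))
        width = max(width, order)
    return width


# A triangle: T is a star whose three leaves carry the three edges.
star = [("c", "l1"), ("c", "l2"), ("c", "l3")]
triangle_tau = {"l1": (1, 2), "l2": (2, 3), "l3": (1, 3)}
```

Each tree edge is processed by two traversals, so the sketch runs in quadratic time; it is meant for checking small examples by hand, not as the linear time procedure discussed in the paper.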
Instead, we can use different types of functions $\tau$. If $\tau$ is a surjective
function that maps every leaf of T to an edge $e \in E(G)$, then we have an
amplified branch decomposition: for each edge $e \in E(G)$ there exists at least
one leaf v of T with $\tau(v) = e$. If, instead, we have a partial function
$\tau$, mapping only some leaves to an edge, but that is injective (every edge
has a unique leaf), then we have an extended branch decomposition.

In what follows we denote as $\mathcal{B}_k$ ($\mathcal{T}_k$) the obstruction set
of the graphs with branchwidth (treewidth) at most k.

Theorem 1 ([17]) The following statements hold. (a) The class of graphs with
bounded branchwidth is closed under taking of minors. (b) $branchwidth(G) \le
treewidth(G) + 1 \le \lfloor \frac{3}{2}\, branchwidth(G) \rfloor$. (c) A graph
has branchwidth 0 ($\le 1$) iff each connected component contains at most one
edge (vertex of degree $\ge 2$). (d) $\mathcal{B}_2 = \{K_4\}$.
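Statement (c) gives a direct linear time test for branchwidth at most 1: every connected component may contain at most one vertex of degree at least 2. A minimal sketch of that test follows, assuming a simple graph stored as a dictionary from vertex to neighbour set; the function name is hypothetical.

```python
def has_branchwidth_at_most_one(adj):
    """True iff every connected component has at most one vertex of degree >= 2."""
    seen = set()
    for source in adj:
        if source in seen:
            continue
        # Collect one connected component by depth-first search.
        component, stack = [], [source]
        seen.add(source)
        while stack:
            x = stack.pop()
            component.append(x)
            for y in adj[x]:
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        if sum(1 for x in component if len(adj[x]) >= 2) > 1:
            return False
    return True


star3 = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

A star passes the test, while a path on four vertices has two vertices of degree 2 and therefore fails it.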

The results from [17] give algorithms for the decision and the construction
problem for branchwidth for k = 0, 1, 2; for instance, graphs have branchwidth at
most 2 if and only if they have treewidth at most 2, and a tree decomposition of
width 2 can be transformed into a branch decomposition of width 2 in linear
time. The following lemma is easy to show.

Lemma 1 There exists an algorithm that, given an amplified branch decomposition
$(T, \tau)$ of a graph G with width $\le 3$, outputs a branch decomposition of
G with width $\le 3$, in $O(|V(T)|)$ time. Moreover, there exists an algorithm
that, given a branch decomposition $(T, \tau)$ of a graph G with width $\le 3$,
outputs a branch decomposition of any subgraph of G with width $\le 3$ in
$O(|V(G)|)$ time.

A reduction is a triple (H, S, f), where H is a graph, $S \subseteq V(H)$,
$S \ne \emptyset$ and $f : V(H) \to \omega + 1$ is a labelling of vertices in H
by ordinals (finite ones and $\omega$), such that $\forall v \in S$ f(v) = 0. We
say that a reduction R = (H, S, f) occurs in G if H is a subgraph of G and for
any $v \in V(H)$ the degree of v in $G[V(G) - V(H) \cup \{v\}]$ is at most f(v).
The result of applying R on G is the graph arising from G if we remove the
vertices in S and connect as a clique in G all vertices in V(H) - S. Given a
graph class $\mathcal{G}$, we say that a set $\mathcal{R}$ of reductions is safe
if, for any $R \in \mathcal{R}$ and for any G such that R occurs in G, the result
of applying R on G is a graph in $\mathcal{G}$ if and only if
$G \in \mathcal{G}$. Also, $\mathcal{R}$ is called complete for $\mathcal{G}$ if
for every non-empty graph $G \in \mathcal{G}$, there is a reduction in
$\mathcal{R}$ occurring in G. Clearly, if a set $\mathcal{R}$ of reduction rules
is safe and complete for a graph class $\mathcal{G}$, then, for any graph G, it
holds that $G \in \mathcal{G}$ if and only if there exists a sequence of
reduction rules in $\mathcal{R}$ that, when successively applied, can reduce G
to the empty graph.
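The two notions "a reduction occurs in G" and "the result of applying it" can be illustrated with a small sketch. The following assumes, purely for illustration, that the vertices of H are already named as the vertices of G they are matched to (so no subgraph search is performed), that the graph is a dictionary from vertex to neighbour set, and that the label $\omega$ is represented by `math.inf`. The function names are hypothetical.

```python
import math


def reduction_occurs(adj, h_vertices, h_edges, f):
    """Check the occurrence condition for an already fixed placement of H in G."""
    hv = set(h_vertices)
    # H must be a subgraph of G.
    if not hv <= set(adj):
        return False
    if not all(b in adj[a] for a, b in h_edges):
        return False
    # Each v in V(H) may have at most f(v) neighbours outside V(H).
    return all(len(adj[v] - hv) <= f[v] for v in hv)


def apply_reduction(adj, h_vertices, s):
    """Remove the vertices in S and turn V(H) - S into a clique."""
    removed = set(s)
    keep = set(h_vertices) - removed
    result = {v: set(nbrs) - removed for v, nbrs in adj.items() if v not in removed}
    for a in keep:
        result[a] |= keep - {a}
    return result


# A degree-2 vertex "s" between "a" and "b" is removed and {a, b} becomes an edge.
g = {"a": {"s"}, "s": {"a", "b"}, "b": {"s", "c"}, "c": {"b"}}
labels = {"a": math.inf, "s": 0, "b": math.inf}
reduced = apply_reduction(g, ["a", "s", "b"], ["s"])
```

A full implementation would also have to locate occurrences of H in G; that search is deliberately left out of this sketch.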

We denote as $\mathcal{R}_{t \le 3}$ the set of reduction rules shown in Figure 1.
For any $R = (H, S, f) \in \mathcal{R}_{t \le 3}$, S is represented by the white
circles and the values of f are shown only when they are not $\omega$ and
correspond to vertices not in S.

Theorem 2 ([4, 12, 15]) $\mathcal{R}_{t \le 3}$ is a safe and complete set of
reduction rules for the class of graphs with treewidth $\le 3$.

Fig. 1. The reduction rules for the class of graphs with treewidth $\le 3$.

We call a graph G chordal when it does not contain any induced cycle of length
$\ge 4$. We call a vertex $v \in V(G)$ simplicial if $G[N_G(v)]$ is a clique. Let
k be an integer. A k-tree is a graph which is defined recursively as follows. A
clique with k + 1 vertices is a k-tree. Given a k-tree G with n vertices, a
k-tree with n + 1 vertices can be constructed by making a new vertex adjacent to
the vertices of a k-clique in G. A graph is a partial k-tree if either it has at
most k vertices or it is a subgraph of a k-tree G with the same vertex set as G.
k-trees are chordal graphs with $\omega(G) = k + 1$ ($\omega(G)$ is the size of
the maximum clique in a graph G). It can be easily proved that a graph has
treewidth $\le k$ iff it is a partial k-tree (see e.g. [21]). Also, if G is a
k-tree, then $|E(G)| = O(k|V(G)|)$. A set $S \subseteq V(G)$ is an
s-t-separator in G ($s, t \in V(G)$) if s and t belong to different connected
components of $G[V(G) - S]$. S is a minimal s-t-separator if it does not contain
another s-t-separator as a proper subset. S is a minimal separator if there
exist vertices $s, t \in V(G)$ for which S is a minimal s-t-separator. It is
known that any minimal separator of a chordal graph induces a clique. We call a
graph G' a triangulation of G if G' is chordal and V(G) = V(G'). We call a
triangulation of G with a minimum number of edges a minimal triangulation.
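The recursive definition of a k-tree translates directly into a construction procedure. The sketch below builds a k-tree from a list of k-cliques to attach new vertices to, and includes a simplicial-vertex test. The names `build_k_tree` and `is_simplicial`, the integer vertex naming, and the dictionary representation are assumptions for this illustration.

```python
from itertools import combinations


def build_k_tree(k, attachments):
    """Start from a (k+1)-clique on 0..k and attach one new vertex per listed k-clique."""
    adj = {v: set(range(k + 1)) - {v} for v in range(k + 1)}
    for clique in attachments:
        clique = set(clique)
        is_clique = all(b in adj[a] for a, b in combinations(sorted(clique), 2))
        if len(clique) != k or not is_clique:
            raise ValueError("each attachment must be a k-clique of the current graph")
        new_vertex = len(adj)
        adj[new_vertex] = set(clique)
        for u in clique:
            adj[u].add(new_vertex)
    return adj


def is_simplicial(adj, v):
    """True iff the neighbourhood of v induces a clique."""
    return all(b in adj[a] for a, b in combinations(sorted(adj[v]), 2))


# A 2-tree on 5 vertices: triangle {0, 1, 2}, then vertex 3 on {0, 1}, vertex 4 on {1, 3}.
two_tree = build_k_tree(2, [(0, 1), (1, 3)])
edge_count = sum(len(nbrs) for nbrs in two_tree.values()) // 2
```

Each attached vertex contributes exactly k edges, so a k-tree on n vertices has k(k+1)/2 + k(n - k - 1) edges, in line with the $O(k|V(G)|)$ bound stated above; the vertex added last is always simplicial.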

Theorem  3  ([9])  Let  G'  be  a  minimal  triangulation  of  a  graph  G.  Then  any 
minimal  separator  in  G'  is  also  a  minimal  separator  in  G. 


Fig. 2. The graphs $M_6$, $M_8$, $M_{10}$, and $Q_3$

Theorem 4 ([5, 19]) $\mathcal{T}_3 = \{K_5, M_6, M_8, M_{10}\}$ (graphs $M_6$,
$M_8$ and $M_{10}$ are shown in Figure 2).

The following can be proved using Theorems 1.b and 4.

Lemma 2 The following three statements hold.

a. There are no graphs in $\mathcal{B}_3$ with treewidth $\le 2$.

b. $Q_3 \in \mathcal{B}_3$ and $treewidth(Q_3) = 3$ (graph $Q_3$ is shown in
Figure 2).

c. The set $\{K_5, M_6, M_8\}$ contains all the graphs of $\mathcal{B}_3$ that
have treewidth $\ge 4$.

Let G be a graph and $S \subseteq V(G)$, $|S| = 4$. We call
$S = \{v_1, v_2, v_3, v_4\}$ a cross if the sets $S_i = S - \{v_i\}$,
$1 \le i \le 4$, are all minimal separators of G. We also define as
$att(G, S_i)$ the set of all the vertices of the connected components of
$G[V(G) - S_i]$ that do not contain the single vertex in $S - S_i$. If a graph
does not contain any cross then we call it crossless.

Using  Theorem  3  we  can  easily  prove  the  following. 
Lemma 3 Let G be a crossless graph of treewidth at most 3 and G' be a minimal
triangulation of G. Then, G' is a crossless chordal graph with
$\omega(G') \le 4$.

Let G be a 3-tree. A tree $T_G$ is the clique tree of G if each vertex in
$V(T_G)$ represents a 4-clique in G and where two vertices
$v = \{v_1, v_2, v_3, v_4\}$, $u = \{u_1, u_2, u_3, u_4\} \in V(T_G)$ are
connected by an edge {v, u} in $T_G$ iff $|v \cap u| = 3$, i.e., they have
exactly 3 vertices in common (notice that each such triple of vertices is a
minimal separator of G). Given an edge $e = \{v, u\} \in E(T_G)$ we define the
separation set of e as $sep(e) = v \cap u$.

We will need the following results which we present without proof.

Lemma 4 There exists an algorithm that, given a 3-tree G, constructs the clique
tree of G in $O(|V(G)|)$ time.

Lemma 5 There exists an algorithm that, given a crossless chordal graph G with
$\omega(G) \le 4$, outputs, in $O(|V(G)|)$ time, a crossless 3-tree G' such that
G is a subgraph of G' where V(G) = V(G').

3  Graphs  with  branchwidth  at  most  3 

In this section we will identify the set $\mathcal{B}_3$ and find a complete and
safe set of reduction rules for the class of graphs with branchwidth $\le 3$.
Our results lead to the construction of a linear time algorithm that tests
whether a graph has branchwidth $\le 3$ and, if so, computes a branch
decomposition of minimum width. According to Theorem 1.c, it is trivial to check
in linear time if G has branchwidth $\le 1$ and, if so, to construct a branch
decomposition of minimum width. Also, from Theorem 1.d, we can check in linear
time if a graph has branchwidth $\ge 3$ or not. In what follows, we examine the
non-trivial case where the input is a graph with branchwidth $\ge 3$. We omit
the case where we are given a graph with branchwidth 2 as it is a very similar
(and much easier) version of the non-trivial case.

The following lemma defines the notion of the labelled clique tree of a crossless
3-tree (the proof is omitted).

Lemma 6 Let $T_G$ be the clique tree of a crossless 3-tree G. Let also, for any
$v \in V(T_G)$, $E_v = \{e \in E(T_G) : v \text{ is incident to } e\}$. Then, for
each $v \in V(T_G)$, $|\{sep(e) : e \in E_v\}| \le 3$. Moreover, it is possible
in O(n) time to compute a labelling function $l : E(T_G) \to \{1, 2, 3\}$ such
that $\forall v \in V(T_G)\ \forall e_1, e_2 \in E_v$
($sep(e_1) = sep(e_2)$ iff $l(e_1) = l(e_2)$), i.e. edges in $E_v$ with the same
separation set have the same label. We call such a clique tree 3-labelled and we
denote it as $(T_G, l)$.

Given a labelled clique tree $(T_G, l)$ we define the span degree of a vertex v
to be equal to $|\{l(e) : e \in E_v\}|$. We also call a leaf u of $T_G$ that is
adjacent to a vertex v simple if $|\{e \in E_v : l(e) = l(\{u, v\})\}| = 1$.

The following can be easily proved by induction on $|V(T_G)|$.

Lemma 7 Let $(T_G, l)$ be a labelled tree with more than 3 vertices. Then one of
the following holds: (i) There exist no simple leaves. (ii) There exists a simple
leaf u in $T_G$ adjacent to a vertex v of span-degree 2. (iii) There exist two
simple leaves $u_1$ and $u_2$ in $T_G$ adjacent to a vertex v of span-degree 3.
Using now Lemma 1 we have a proof of the following lemma, which provides
the basic algorithm of this section (the proof is long and is omitted due to space
limitations).

Lemma 8 There exists a linear time algorithm that, given a 3-labelled clique tree
of a crossless 3-tree G, constructs a branch decomposition of G of width 3.

Combining Lemmas 1, 4, 5, 6, and 8 gives the following result.

Theorem 5 Any crossless chordal graph with $\omega(G) \le 3$ has a branch
decomposition of width 3. Moreover it is possible to construct an algorithm that
finds such a branch decomposition in $O(|V(G)|)$ time.

Using Theorems 1 and 5 and Lemma 2, we can now prove the following.

Theorem 6 The following two propositions hold: (i) $branchwidth(G) \le 3
\Leftrightarrow treewidth(G) \le 3 \wedge Q_3 \not\preceq G \Leftrightarrow
treewidth(G) \le 3 \wedge G$ is crossless. (ii) The obstruction set of the graphs
of branchwidth at most three, $\mathcal{B}_3$, equals
$\{K_5, M_6, M_8, Q_3\}$.

We denote as $\mathcal{R}_{b \le 3}$ the set of reduction rules shown in Figure 3.

Fig. 3. The reduction rules for the class of graphs with branchwidth $\le 3$.


Using Theorems 1.a, 2, and 6, we can prove the following.

Lemma 9 $\mathcal{R}_{b \le 3}$ is a safe set of reduction rules for the class of
graphs with bounded branchwidth.

Also, using the case analysis of Lemma 7, we can prove the following.

Lemma 10 The following two propositions hold: (i) If G is a crossless 3-tree
then there exists a reduction rule in $\mathcal{R}_{b \le 3}$ occurring in G.
(ii) If there exists some reduction rule in $\mathcal{R}_{b \le 3}$ occurring in a
graph G, then, for any subgraph G' of G where V(G) = V(G'), there exists also
some rule in $\mathcal{R}_{b \le 3}$ occurring in G'.

Using now Lemmas 3, 5, 9, and 10, we can prove the following.

Theorem 7 $\mathcal{R}_{b \le 3}$ is a safe and complete set of rules for
rewriting graphs of branchwidth $\le 3$.

Using  now  Theorems  7  and  5,  we  can  prove  the  following. 

Theorem 8 One can construct an algorithm that tests if a given graph has
branchwidth at most 3 and, if so, outputs a branch decomposition of minimum
width, and that uses O(n) time.

4 A linear time algorithm for graphs with branchwidth $\le k$

In  this  section,  we  will  show  the  following  theorem. 

Theorem 9 For every k, one can construct an algorithm that, given a graph
G = (V, E), decides whether the branchwidth of G is at most k, and if so,
constructs a branch decomposition of G of minimum width, and that uses
$O(|V|)$ time.

(Note that if the branchwidth of a graph is bounded by a constant, then
$|E| = O(|V|)$.) While the theorem generalises the result of the previous
section, it should be noted that the algorithm here has a large (exponential)
constant factor, making it (at least without considerable optimisations in the
implementation) not practical, whereas the algorithm in the previous section for
the case that k = 3 is a practical and efficient algorithm.

Note that a non-constructive version of Theorem 9 can almost directly be
obtained from the results in [16] and [6] as, for every k, the class of graphs
with branchwidth at most k is closed under taking of minors. Also, note from the
result in [6] that it is sufficient to prove the following result:

Lemma 11 For every k, l, one can construct an algorithm that, given a graph
G = (V, E) with a tree decomposition of G of width at most l, decides whether
the branchwidth of G is at most k, and if so, constructs a branch decomposition
of G of minimum width, and that uses linear time.

A terminal graph is a triple G = (V, E, X) with (V, E) a graph, and X an
ordered set of vertices from V. If |X| = k, we call (V, E, X) a k-terminal graph.
Given an extended branch decomposition, we can build a branch decomposition
of the same graph with the same width as follows: repeatedly remove leaves
that have no edge associated with them and contract over nodes of degree 2 in
the tree. We call the obtained branch decomposition the shrunken form of the
extended branch decomposition.

Let G = (V, E) be a graph, and let H = (V', E') be a subgraph of G. A branch
decomposition $(T, \tau)$ of G is an extension of a branch decomposition
$(T', \tau')$ of H if $(T', \tau')$ can be obtained as follows: let $\tau''$ be
the restriction of $\tau$ to those leaves that map to an edge in E'. Then
$(T, \tau'')$ is an extended branch decomposition of H. Now let $(T', \tau')$ be
the shrunken form of $(T, \tau'')$.

The full model of a branch decomposition $(T, \tau)$ of a terminal graph
G = (V, E, X) is the 4-tuple $(T, \tau, \beta, \gamma)$, where $\beta$ is a
function that maps each edge e of T to the set of vertices in X that belong to
e, and $\gamma$ is a function that maps each edge in T to its order.
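The shrinking procedure (remove unlabelled leaves, contract nodes of degree 2) can be sketched as follows. This is an illustrative sketch under assumptions: the tree is given as a list of node pairs, the partial leaf-to-edge function is a dictionary called `tau`, tree nodes are strings so that the output can be sorted, and the function name `shrunken_form` is hypothetical. It makes repeated passes and is therefore not tuned for linear time.

```python
def shrunken_form(tree_edges, tau):
    """Return (tree edges, leaf labelling) after removing unlabelled leaves
    and contracting nodes of degree 2."""
    adj = {}
    for a, b in tree_edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    changed = True
    while changed:
        changed = False
        for x in list(adj):
            nbrs = adj[x]
            if x in tau:
                continue  # labelled leaves are always kept
            if len(nbrs) == 1:
                # Unlabelled leaf: delete it.
                (y,) = nbrs
                adj[y].discard(x)
                del adj[x]
                changed = True
            elif len(nbrs) == 2:
                # Degree-2 node: replace the path a - x - b by the edge a - b.
                a, b = nbrs
                adj[a].discard(x)
                adj[b].discard(x)
                adj[a].add(b)
                adj[b].add(a)
                del adj[x]
                changed = True
    edges = {frozenset((a, b)) for a in adj for b in adj[a]}
    kept_tau = {x: tau[x] for x in adj if x in tau}
    return sorted(tuple(sorted(e)) for e in edges), kept_tau


# Leaf "l2" carries no edge, so it is removed and "c1" is contracted away.
ext_tree = [("c1", "l1"), ("c1", "l2"), ("c1", "c2"), ("c2", "l3"), ("c2", "l4")]
ext_tau = {"l1": (1, 2), "l3": (2, 3), "l4": (1, 3)}
small_tree, small_tau = shrunken_form(ext_tree, ext_tau)
```

Neither operation changes which labelled leaves lie on which side of a surviving tree edge, which is why the width is preserved.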

A model of a branch decomposition $(T, \tau)$ of terminal graph G = (V, E, X) is
a 4-tuple that is obtained from the full model of $(T, \tau)$ by applying 0 or
more of the following operations:

• Suppose {v, w} and {v, x} are edges in T, w, x leaves in T, and
$\beta(\{v, w\}) \subseteq \beta(\{v, y\})$,
$\beta(\{v, x\}) \subseteq \beta(\{v, y\})$; y the third neighbor of v in T. Then
remove edges {v, w} and {v, x} from T, and restrict $\tau$, $\beta$, and
$\gamma$ to the edges in the smaller tree.

• Suppose $v_1, v_2, \ldots, v_r$ form a path in T, with all vertices
$v_2, \ldots, v_{r-1}$ adjacent to a leaf ($\ne v_1, v_r$) in T. Suppose that
$\beta(\{v_1, v_2\}) = \beta(\{v_2, v_3\}) = \cdots = \beta(\{v_{r-1}, v_r\})$.
Suppose also that $\gamma(\{v_1, v_2\}) \le \min\{\gamma(\{v_2, v_3\}), \ldots,
\gamma(\{v_{r-2}, v_{r-1}\})\}$ and that $\gamma(\{v_{r-1}, v_r\}) \le
\max\{\gamma(\{v_2, v_3\}), \ldots, \gamma(\{v_{r-2}, v_{r-1}\})\}$. Then,
identify $v_2$ and $v_{r-1}$, remove vertices $v_3, \ldots, v_{r-2}$, their
adjacent edges, their leaf-neighbors, and the leaf-neighbor ($\ne v_1$) of
$v_2$.

The characteristic of a branch decomposition $(T, \tau)$ of terminal graph
G = (V, E, X) is the model of $(T, \tau)$ that cannot be reduced by applying
one of these two operations. One can show that the characteristic of a branch
decomposition is unique.

Lemma 12 Let k be fixed. There are functions $f_1$, $f_2$ such that each
characteristic of a branch decomposition of a terminal graph G = (V, E, X) of
width at most k has at most $f_1(|X|)$ nodes on its tree, and each terminal graph
G = (V, E, X) has at most $f_2(|X|)$ different characteristics of branch
decompositions of G of width at most k.

Lemma 13 Let $G_1 = (V_1, E_1, X)$, $G_2 = (V_2, E_2, X)$ be terminal graphs with
the same set of terminals. If $H = (V_3, E_3, X)$ is a terminal graph, and
$(T_1, \tau_1)$, $(T_2, \tau_2)$ are branch decompositions, respectively of
$G_1$ and $G_2$, with the same characteristic, then there is an extension of
$(T_1, \tau_1)$ that is a branch decomposition of $G_1 \oplus H$ of width
$\le k$, if and only if there is an extension of $(T_2, \tau_2)$ that is a
branch decomposition of $G_2 \oplus H$ of width $\le k$.

The two lemmas above imply together that the property that a graph has
branchwidth at most k is finite state, or regular (see e.g. [1]). (A class of
graphs $\mathcal{G}$ is finite state if the equivalence relation on l-terminal
graphs defined by $G \sim H \Leftrightarrow (\forall K:\ G \oplus K \in
\mathcal{G} \Leftrightarrow H \oplus K \in \mathcal{G})$ has a finite number of
equivalence classes, for each fixed l.)

Combining the above results, and results and techniques from [1, 3, 6, 7, 13], we
obtain the following result. (The results show that each of these is computable,
although practical efficiency and feasibility are not guaranteed.)

Theorem 10 One can construct, for each k,

• linear time algorithms that decide whether a graph has branchwidth at most k
(these algorithms can either use a tree or branch decomposition and dynamic
programming, or use graph reduction),

• parallel algorithms that decide whether a graph has branchwidth at most k,
that use $O(\log n \log^* n)$ time on an EREW PRAM or $O(\log n)$ time on a CRCW
PRAM and $O(n)$ operations,

• a sentence in monadic second order logic expressing whether a graph has
branchwidth at most k or not,

• the obstruction set of the graphs of branchwidth at most k.

However,  we  do  not  only  want  to  decide  whether  the  branchwidth  is  at  most  k, 
but  also  to  build  a  corresponding  branch  decomposition. 

A model $(T_1, \tau_1, \beta_1, \gamma_1)$ is dominated by a model
$(T_2, \tau_2, \beta_2, \gamma_2)$ if $(T_2, \tau_2, \beta_2, \gamma_2)$ can be
obtained from $(T_1, \tau_1, \beta_1, \gamma_1)$ by 0 or more of the following
operations:

• Contract some edges in $T_1$.

• Remove vertices from sets $\beta_1(e)$, for some edges e in $T_1$.

• Decrease numbers $\gamma_1(e)$ for some edges e in $T_1$.

Lemma 14 Let $G_1 = (V_1, E_1, X)$, $G_2 = (V_2, E_2, X)$ be terminal graphs with
the same set of terminals. If $H = (V_3, E_3, X)$ is a terminal graph, and if the
characteristic of branch decomposition $(T_1, \tau_1)$ of $G_1$ is dominated by
the characteristic of branch decomposition $(T_2, \tau_2)$ of $G_2$, then if
there is an extension of $(T_1, \tau_1)$ that is a branch decomposition of
$G_1 \oplus H$ of width at most k, then there is an extension of
$(T_2, \tau_2)$ that is a branch decomposition of $G_2 \oplus H$ of width at
most k.

A full set of characteristics (for branchwidth = k) of a terminal graph
G = (V, E, X) is a set of characteristics S of branch decompositions of G of
width at most k, such that each characteristic of a branch decomposition of G of
width at most k is dominated by an element of S.

Suppose we have a tree decomposition $(\{X_i \mid i \in I\}, T = (I, F))$ of a
graph G, with T a rooted binary tree. To each $i \in I$, we associate the
terminal graph $G_i = (V_i, E_i, X_i)$, with $V_i = \bigcup \{X_j \mid j$ is a
descendant of i or $j = i\}$ and $E_i = E(G[V_i])$.

The  following  lemma  (with  a  proof  that  resembles  a  technique  from  [8])  gives  us 
a  method  to  compute  full  sets  of  characteristics. 

Lemma 15 Let k, l be constants. Let $(\{X_i \mid i \in I\}, T = (I, F))$ be a tree
decomposition of G of width at most l. For any node $i \in I$ with at most 2
children in T, we can compute a full set of characteristics for branchwidth = k
of $G_i$, given full sets of characteristics for branchwidth = k of all terminal
graphs associated with the children of i.

Using the algorithm of the lemma above, we can compute full sets of
characteristics for all graphs associated with nodes in the given tree
decomposition of the input graph, in time linear in the size of T (process nodes
in a bottom-up order). When the full set of the root node is non-empty, the
branchwidth of the input graph is at most k, otherwise not. Finally, by extra
bookkeeping, we can build a branch decomposition of width at most k (when
existing), in linear time. As the details are cumbersome, they are omitted here.
Abstract. We present a finite, special, and confluent string-rewriting
system for which the word matching problem is undecidable. Since the
word matching problem is the non-symmetric restriction of the word
unification problem, this presents a non-trivial improvement of the recent
result that for this type of string-rewriting systems, the word unification
problem is undecidable (Otto 1995). In fact, we show that our
undecidability result remains valid even when we only consider very restricted
instances of the word matching problem.


Key words: matching, unification, equational theory, string-rewriting systems

1  Introduction  and  basic  definitions 

Equational unification and matching have generated a lot of interest recently,
mainly due to their importance in term-rewriting systems and equational
reasoning. Historically, one of the earliest equational unification problems that
have been studied extensively is word unification, which is the problem of
solving word equations¹. The general question of whether the solvability of a
word equation is decidable or not remained open for a long time, until it was
finally settled positively by Makanin [Mak77].

Since Makanin’s paper appeared, his algorithm has been the subject of many
research activities. The objectives have been to simplify the proof of the
termination and correctness of his algorithm [Pec81, Sch93], to develop simpler
algorithms for deciding the solvability of word equations [Jaf90, Sch90], and to
compute a description for the set of all solutions of a solvable word equation
[MaAb94]. Observe that a word equation can have a minimal complete set of
most general unifiers that is infinite, that is, the theory of associativity is of
unification type infinitary.

Partially supported by the NSF grants CCR-9404930 and INT-9401087.

¹ This is also known as Markov’s problem or Löb’s problem.

Makanin also extended his result further by showing that the word unification
problem is decidable for finitely generated free groups [Mak83, Mak85]. Since
finitely generated free groups can be specified by finite, special, and confluent
string-rewriting systems, this leads naturally to the question of whether the
solvability of word equations modulo finite, special, and confluent
string-rewriting systems is decidable in general. Here a string-rewriting system
is called special if it only contains rules of the form $\ell \to \lambda$, where
$\lambda$ denotes the empty string. Obviously, rewriting modulo such a system is
particularly simple, since it simply amounts to the deletion of substrings. A
special system R is called confluent if each string has a unique irreducible
descendant with respect to the reduction relation induced by R. In this
situation the set IRR(R) of irreducible strings modulo R forms a set of unique
representatives for the Thue congruence induced by R (see, e.g., [BoOt93]). But
here the answer to the solvability question turned out to be negative as shown
by Otto [Ott95], who presents a particular finite, special, and confluent
string-rewriting system for which the word unification problem is undecidable.
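Since reduction modulo a special system only deletes occurrences of left-hand sides, it is easy to sketch. The code below is a minimal illustration, not part of the paper: the function name `reduce_special` is hypothetical, strings are ordinary Python strings, and the example system (the one-generator free group with `b` playing the role of the inverse of `a`) is chosen only because it is known to be special and confluent.

```python
def reduce_special(word, left_hand_sides):
    """Delete occurrences of left-hand sides until the word is irreducible.

    For a confluent system the result does not depend on the order of deletions.
    """
    if any(lhs == "" for lhs in left_hand_sides):
        raise ValueError("left-hand sides must be non-empty")
    changed = True
    while changed:
        changed = False
        for lhs in left_hand_sides:
            i = word.find(lhs)
            if i != -1:
                # Apply the rule lhs -> empty string at position i.
                word = word[:i] + word[i + len(lhs):]
                changed = True
                break
    return word


# Rules ab -> empty and ba -> empty present the free group on one generator.
free_group_rules = ["ab", "ba"]
normal_form = reduce_special("aabbab", free_group_rules)
```

Every deletion shortens the word, so the loop always terminates; two strings are congruent modulo a confluent system exactly when they reduce to the same irreducible string.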

Now where exactly is the borderline between the decidable and the undecidable
cases of the problem of deciding the solvability of word equations? On the
one hand, one could try to restrict the finite string-rewriting systems considered
even further. A reasonable candidate would be the class of finite, special, and
confluent string-rewriting systems that present groups. Is the solvability of word
equations in general decidable or undecidable for this class of string-rewriting
systems? (This question is still open.)

Here  we  follow  a  different  approach.  Instead  of  restricting  the  class  of  string¬ 
rewriting  systems  considered  even  further,  we  put  an  additional  restriction  on  the 
form  of  the  word  equations  that  we  admit.  While  a  typical  instance  of  the  word 
unification  problem  consists  of  a  pair  of  strings  {u,v),  where  both  u  and  v  contain 
variables  that  must  be  instantiated  in  order  to  get  instances  e{u)  and  ^(t^)  that 
are  congruent  modulo  the  system  R  considered,  we  look  at  word  equations  of  the 
form  (u,v),  where  only  one  side,  say  w,  contains  variables.  Hence,  such  a  word 
equation  has  a  solution  modulo  R  if  and  only  if  there  exists  an  instantiation  9 
such  that  the  strings  6{u)  and  v  are  congruent  modulo  R.  This  restricted  version 
of  the  word  unification  problem  is  known  as  the  word  matching  problem. 

Here  we  strengthen  the  above-mentioned  undecidability  result  by  showing 
that  there  is  a  finite,  special,  and  confluent  string-rewriting  system  for  which 
the  word  matching  problem  is  undecidable.  In  fact,  we  consider  rather  restricted 
instances  of  the  word  matching  problem,  since  we  look  at  word  equations  of 
the form (u, λ). Recall that λ is used to denote the empty string. Such a word
equation has a solution modulo R if and only if there exists an instantiation
θ of the variables occurring in u such that the string θ(u) is congruent to the
empty string λ modulo R. We present a finite, special, and confluent
string-rewriting system for which this restricted variant of the word matching problem
is  undecidable. 

This paper is organized as follows. In Section 2 we present a finite, special,
and confluent string-rewriting system ℛ for which the word matching problem
is undecidable, thus establishing a weak version of the intended undecidability
result. In the following section we extend ℛ to a finite, special, and confluent
system for which even the above-mentioned restricted variant of the word
matching problem is undecidable.

One  may  ask  why  we  actually  give  both  these  proofs,  since  the  latter  result  is 
clearly  stronger  than  the  former.  However,  the  proof  of  the  former  is  simpler  and 
therefore,  it  illustrates  the  key  ideas  used  in  the  reductions  more  clearly.  Further, 
the technical results on the system ℛ established in Section 2 are needed anyway
in  proving  the  stronger  result  in  Section  3. 

We  close  this  section  by  providing  the  main  definitions  and  notation  necessary 
to  follow  our  arguments.  For  a  more  detailed  treatment  of  the  basics  and  for 
additional  information  on  the  notions  introduced,  we  refer  the  interested  reader 
to  the  following  surveys  -  [BaSi94,  JoKi91]  for  unification,  [DeJo90]  for  term¬ 
rewriting,  and  [BoOt93]  for  string-rewriting. 

A string-rewriting system (often called a 'Thue system') on an alphabet Σ is
a finite set of pairs of strings R ⊆ Σ* × Σ*. In this note we will only be dealing
with string-rewriting systems that are length-reducing, that is, we assume that
|ℓ| > |r| for each pair (ℓ, r) ∈ R. These pairs are often referred to as rewrite rules
and are sometimes represented as ℓ → r. A string-rewriting system R is said to
be special if r = λ for each pair (ℓ, r) in R. As mentioned before, λ denotes the
empty string.

By →_R we denote the single-step reduction relation that is defined by the
string-rewriting system R: →_R := {(uℓv, urv) | (ℓ → r) ∈ R, u, v ∈ Σ*}. Its
reflexive and transitive closure →*_R is the reduction relation induced by R. The
relation ↔*_R := (→_R ∪ ←_R)* is called the Thue congruence generated by R. By
IRR(R) we denote the set of irreducible strings modulo R, that is, u ∈ IRR(R)
if and only if u →_R v does not hold for any string v.
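
The definitions above can be made concrete with a small sketch. The Python below is only an illustration: the example system RULES (ab → λ, ba → λ) and all function names are assumptions chosen for this example and do not come from the text. It computes an irreducible descendant of a string under a finite length-reducing system; when the system is confluent, this descendant is unique, so comparing normal forms decides the Thue congruence.

```python
# Hypothetical example system on the alphabet {a, b}: the special rules
# ab -> λ and ba -> λ (the empty string λ is written "" in Python).
RULES = [("ab", ""), ("ba", "")]


def reduce_once(word, rules):
    """Apply one rule at one occurrence of a left-hand side.

    Returns the rewritten string, or None if the word is irreducible.
    """
    for lhs, rhs in rules:
        pos = word.find(lhs)
        if pos != -1:
            return word[:pos] + rhs + word[pos + len(lhs):]
    return None


def normal_form(word, rules):
    """Rewrite until no rule applies; every step shortens the word."""
    # Guard: termination is only guaranteed for length-reducing rules.
    assert all(len(lhs) > len(rhs) for lhs, rhs in rules)
    while True:
        nxt = reduce_once(word, rules)
        if nxt is None:
            return word
        word = nxt


def is_irreducible(word, rules):
    """Membership test for IRR(R)."""
    return reduce_once(word, rules) is None


def congruent(u, v, rules):
    """For a confluent system, u and v are congruent iff their normal forms agree."""
    return normal_form(u, rules) == normal_form(v, rules)
```

For a system that is not confluent, `congruent` would only be a sufficient test, since two congruent strings may then have different irreducible descendants.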

Let V be a set of variables that range over Σ*. A string equation or word
equation is an equation of the form u = v where u and v are strings over (Σ ∪ V)*.
An assignment or substitution θ : V → Σ* is a solution of the equation u = v
modulo R if and only if θ(u) ↔*_R θ(v). Here θ is extended to a morphism θ :
(V ∪ Σ)* → Σ* in the obvious way.

The  word  matching  problem  for  a  string-rewriting  system  R  is  the  following 
decision  problem: 

INSTANCE : A string u ∈ (V ∪ Σ)* and a string v ∈ Σ*.

QUESTION : Is there a substitution θ satisfying θ(u) ↔*_R v?
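
To illustrate the shape of the problem, here is a hedged sketch of a bounded search for a matching substitution. It is not a decision procedure (the paper shows that none exists in general): it only tries values up to a length bound, and it assumes that the rule set is length-reducing and confluent so that comparing normal forms is sound. The token representation of u and all names are assumptions made for this example.

```python
from itertools import product


def normal_form(word, rules):
    """Irreducible descendant of word (assumes length-reducing rules)."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in word:
                word = word.replace(lhs, rhs, 1)
                changed = True
                break
    return word


def words_up_to(alphabet, max_len):
    """All strings over alphabet of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def match_bounded(u, v, variables, alphabet, rules, max_len):
    """Search for theta with normal_form(theta(u)) == normal_form(v).

    u is a list of tokens; tokens listed in `variables` are variables and
    all other tokens are letters. Only values of length at most max_len are
    tried, so a result of None does not show that no solution exists.
    """
    target = normal_form(v, rules)
    candidates = list(words_up_to(alphabet, max_len))
    for values in product(candidates, repeat=len(variables)):
        theta = dict(zip(variables, values))
        instance = "".join(theta.get(tok, tok) for tok in u)
        if normal_form(instance, rules) == target:
            return theta
    return None
```

With the single rule ab → λ, the call `match_bounded(["x", "b"], "", ["x"], "ab", [("ab", "")], 2)` finds the substitution x ↦ a, because θ(u) = ab reduces to the empty string.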

2  Undecidability  of  the  word  matching  problem 


The  announced  undecidability  result  will  be  proved  by  a  reduction  from  the 
following  undecidable  problem. 

Theorem 1. [NaOt90]

There exists a set of pairs of non-empty strings P = {(x_i, y_i) | i = 1, ..., n} ⊆
{a,b}+ × {a,b}+ such that the following problem is undecidable:

INSTANCE : Two non-empty strings x_0, y_0 ∈ {a,b}+.

QUESTION : Do there exist indices i_1, ..., i_k, each i_j from {1, ..., n}, such
that x_0 x_{i_1} x_{i_2} ··· x_{i_k} = y_0 y_{i_1} y_{i_2} ··· y_{i_k}?

We  reduce  this  problem,  which  is  a  specialized  form  of  the  well-known  Mod¬ 
ified  Post  Correspondence  Problem  (MPCP),  to  the  word  matching  problem 
modulo  a  particular  finite,  special,  and  confluent  string-rewriting  system.  The 
system we construct is on the alphabet consisting of the letters a and b, symbols
for each of the numbers 1 to n, and special symbols S, #, and $. In other words,
let

Γ := {a, b} ∪ {1, ..., n} ∪ {S, #, $}.

The rules of our system are divided into three classes. The first class
corresponds to the x_i's, the first components of the pairs in P, the second to the y_i's,
and the third class is to ensure that a string is indeed from {a,b}*.

The rules from Class I are

x_i S i → λ, i ∈ {1, ..., n}.

Class II consists of

y_i i S → λ, i ∈ {1, ..., n}.

Class III consists of the rules

a## → λ, b## → λ, a$$ → λ, b$$ → λ.

Observe that the string-rewriting system ℛ that consists of the above three
classes  of  rules  is  a  finite  special  system  that  is  in  addition  confluent.  In  fact, 
there  are  no  non-trivial  critical  pairs  at  all  for  this  system  (see,  e.g.,  [BoOt93]). 
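
The construction can be sketched in code. The following Python is a minimal illustration under stated assumptions: the rules are read as x_i S i → λ (Class I), y_i i S → λ (Class II), and c## → λ, c$$ → λ for c ∈ {a, b} (Class III); each index i is encoded as a single digit character, so n ≤ 9 is assumed; and the sample pairs are hypothetical. The function `overlaps` looks for left-hand sides that overlap, which is how critical pairs of a string-rewriting system arise.

```python
def build_system(pairs):
    """Build the special system from pairs (x_i, y_i), i = 1..n (assumes n <= 9)."""
    rules = []
    for i, (x, y) in enumerate(pairs, start=1):
        rules.append((x + "S" + str(i), ""))  # Class I:  x_i S i -> λ
        rules.append((y + str(i) + "S", ""))  # Class II: y_i i S -> λ
    for c in "ab":                            # Class III
        rules.append((c + "##", ""))
        rules.append((c + "$$", ""))
    return rules


def overlaps(rules):
    """Return pairs of left-hand sides that overlap non-trivially."""
    lhss = [lhs for lhs, _ in rules]
    found = []
    for l1 in lhss:
        for l2 in lhss:
            # A proper suffix of l1 coincides with a proper prefix of l2.
            shortest = min(len(l1), len(l2))
            proper = any(l1[-k:] == l2[:k] for k in range(1, shortest))
            # l2 occurs as a factor of a different left-hand side l1.
            inside = l1 != l2 and l2 in l1
            if proper or inside:
                found.append((l1, l2))
    return found
```

For the hypothetical pairs [("a", "ba"), ("bb", "b")] the function `overlaps` returns an empty list, in line with the observation that the system has no non-trivial critical pairs.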

We will show that the simultaneous variant of the word matching problem is
undecidable for this system ℛ, which is the following decision problem:

INSTANCE : A finite sequence (u_1, v_1), ..., (u_m, v_m) of pairs of strings from
(V ∪ Γ)* × Γ*.

QUESTION : Is there a substitution θ satisfying θ(u_i) ↔*_ℛ v_i simultaneously
for all i = 1, ..., m?

Since this simultaneous variant of the word matching problem is reducible to
the (single) word matching problem by introducing a new letter, say ¢, this will
give our intended undecidability result.

Lemma 2. Let X be an irreducible string from Γ*. Then

X#Y_1 →*_ℛ λ, XY_1# →*_ℛ λ, X$Y_2 →*_ℛ λ, and XY_2$ →*_ℛ λ,
for some Y_1, Y_2 ∈ Γ* if and only if X ∈ {a,b}*.

Proof: The 'if'-part is trivial. The proof of the 'only if'-part is by contradiction.
Let Z be a shortest counterexample, that is, a shortest irreducible string such
that

Z#Y_1 →*_ℛ λ, ZY_1# →*_ℛ λ, Z$Y_2 →*_ℛ λ, and ZY_2$ →*_ℛ λ,

for some Y_1, Y_2, where Z ∉ {a,b}*. Clearly Z cannot end in a $ or a #, for if
Z = Z'$, then Z'$#Y_1 will not be reducible. Thus Z has to end in either an a
or a b.

Let Z = Z'α, where α ∈ {a,b}. For Z'α#Y_1 to be reducible, the leftmost
symbol of Y_1 must be a #. Because of the second condition ZY_1# →*_ℛ λ, the
next (that is, second from the left) symbol in Y_1 must also be a #. In other
words, Y_1 = ##Y_3 for some Y_3. By similar reasoning, Y_2 = $$Y_4 for some Y_4.
Now,

Z'α###Y_3 →_ℛ Z'#Y_3 →*_ℛ λ,

Z'α##Y_3# →_ℛ Z'Y_3# →*_ℛ λ,

Z'α$$$Y_4 →_ℛ Z'$Y_4 →*_ℛ λ, and

Z'α$$Y_4$ →_ℛ Z'Y_4$ →*_ℛ λ,

which shows that Z' is a shorter counterexample. This contradicts the choice of
Z, and hence, we conclude that there is no such counterexample. □
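
The 'if'-part can be checked mechanically on small examples. The sketch below is an illustration only: it uses the Class III rules a## → λ, b## → λ, a$$ → λ, b$$ → λ, and the particular choice Y_1 = #^(2m-1), Y_2 = $^(2m-1) for a non-empty X of length m is an assumption of this example rather than something stated in the text.

```python
CLASS_III = [("a##", ""), ("b##", ""), ("a$$", ""), ("b$$", "")]


def normal_form(word, rules):
    """Irreducible descendant of word (rules are length-reducing)."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in word:
                word = word.replace(lhs, rhs, 1)
                changed = True
                break
    return word


def padding(x, mark):
    """Candidate witness: 2|x| - 1 copies of the marker symbol (x non-empty)."""
    return mark * (2 * len(x) - 1)


def reduces_to_empty(x):
    """Check the four reductions of the lemma for the candidate witnesses."""
    y1, y2 = padding(x, "#"), padding(x, "$")
    words = [x + "#" + y1, x + y1 + "#", x + "$" + y2, x + y2 + "$"]
    return all(normal_form(w, CLASS_III) == "" for w in words)
```

For X = ab, for instance, the string ab#### reduces via b## → λ to a## and then to the empty string. A negative answer from `reduces_to_empty` says nothing about other choices of Y_1 and Y_2; the 'only if'-part needs the argument given in the proof.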

Lemma 3. Let x_0, y_0 ∈ {a,b}+. Then, for X_1, X_2 ∈ {a,b}*,

X_1 S Y →*_ℛ x_0 S and X_2 Y →*_ℛ y_0

for some Y if and only if there exist indices j_1, ..., j_k ∈ {1, ..., n} such that

X_1 = x_0 x_{j_1} ··· x_{j_k} and X_2 = y_0 y_{j_1} ··· y_{j_k}.

Proof: The 'if' part is straightforward. By examining the rules from Classes I
and II, we can easily see that by taking Y := j_k S ··· j_1 S we can reduce X_1 S Y
by Class I rules to x_0 S, and X_2 Y by Class II rules to y_0.
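
The 'if' part can be traced on a concrete example. The sketch below is illustrative only: it assumes the rule shapes x_i S i → λ and y_i i S → λ, encodes each index as a single digit (so n ≤ 9), and uses hypothetical pairs and strings x_0, y_0 that are not taken from the paper.

```python
def build_system(pairs):
    """Rules x_i S i -> λ, y_i i S -> λ, plus the Class III rules (assumes n <= 9)."""
    rules = []
    for i, (x, y) in enumerate(pairs, start=1):
        rules.append((x + "S" + str(i), ""))
        rules.append((y + str(i) + "S", ""))
    for c in "ab":
        rules.append((c + "##", ""))
        rules.append((c + "$$", ""))
    return rules


def normal_form(word, rules):
    """Irreducible descendant of word (rules are length-reducing)."""
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs in word:
                word = word.replace(lhs, rhs, 1)
                changed = True
                break
    return word


def witness(x0, y0, pairs, indices):
    """Return X_1, X_2 and Y := j_k S ... j_1 S for the indices j_1, ..., j_k."""
    x1 = x0 + "".join(pairs[j - 1][0] for j in indices)
    x2 = y0 + "".join(pairs[j - 1][1] for j in indices)
    y = "".join(str(j) + "S" for j in reversed(indices))
    return x1, x2, y
```

With the hypothetical pairs [("a", "ba"), ("bb", "b")], x_0 = ab, y_0 = a and indices 2, 1, the witness is Y = 1S2S; the string X_1 S Y = abbbaS1S2S reduces to abS, and X_2 Y = abba1S2S reduces to a.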

We prove the 'only if' part by contradiction. Let U_1, U_2 ∈ {a,b}* be minimal
counterexamples in terms of their combined length. Obviously, we may assume
without loss of generality that Y is irreducible. Clearly Y ≠ λ; if Y were λ, then
U_1 S = x_0 S and U_2 = y_0.

If Y ≠ λ, then the only rules that are applicable to U_1 S Y are the rules from
Class I. Thus there must be a rule x_p S p → λ and strings U_1' and Y' such that
U_1 = U_1' x_p, Y = pY', and U_1 S Y →_ℛ U_1' Y' →*_ℛ x_0 S.

Then U_2 Y = U_2 p Y' must also be reducible, and it can be seen that only the
rule y_p p S → λ from Class II can apply. In other words, U_2 = U_2' y_p, Y' = SY''
for some U_2', Y'', and U_2 p Y' →_ℛ U_2' Y'' →*_ℛ y_0.

Thus we get

U_1' S Y'' →*_ℛ x_0 S and U_2' Y'' →*_ℛ y_0,
and this contradicts the minimality of U_1 and U_2.
The  following  result  is  now  immediate. 

Lemma 4. Let x_0, y_0 ∈ {a,b}+. Then there is a string X ∈ {a,b}* satisfying

X S Y →*_ℛ x_0 S and X Y →*_ℛ y_0

for some Y if and only if the instance {(x_0, y_0)} of the MPCP has a solution.

Combining  the  technical  results  obtained  we  arrive  at  the  following  result. 

Theorem 5. For all x_0, y_0 ∈ {a,b}+, the following two statements are
equivalent:

(a) the MPCP has a solution for {(x_0, y_0)};

(b) there exist strings X, Y, Z_1, Z_2 ∈ Γ* such that the following congruences are
satisfied simultaneously:

1. XSY ↔*_ℛ x_0 S,

2. XY ↔*_ℛ y_0,

3. X#Z_1 ↔*_ℛ λ,

4. XZ_1# ↔*_ℛ λ,

5. X$Z_2 ↔*_ℛ λ, and

6. XZ_2$ ↔*_ℛ λ.

Thus,  we  have  the  following  undecidability  result. 

Corollary 6. The simultaneous variant of the word matching problem is
undecidable for the finite, special, and confluent string-rewriting system ℛ.

Now consider the extended alphabet Δ := Γ ∪ {¢}, and the following two
strings u ∈ (Δ ∪ V)* and v ∈ Δ*:

u := v_1 S v_2 ¢ v_1 v_2 ¢ v_1 # v_3 ¢ v_1 v_3 # ¢ v_1 $ v_4 ¢ v_1 v_4 $,
v := x_0 S ¢ y_0 ¢ ¢ ¢ ¢,

where v_1, ..., v_4 ∈ V.

Then there exists a substitution θ : {v_1, ..., v_4} → Δ* satisfying θ(u) ↔*_ℛ v
if and only if the following congruences are satisfied simultaneously, where X :=
θ(v_1), Y := θ(v_2), Z_1 := θ(v_3), and Z_2 := θ(v_4):

1. XSY ↔*_ℛ x_0 S,

2. XY ↔*_ℛ y_0,

3. X#Z_1 ↔*_ℛ λ,

4. XZ_1# ↔*_ℛ λ,

5. X$Z_2 ↔*_ℛ λ, and

6. XZ_2$ ↔*_ℛ λ.

By the theorem above this means that the instance (u, v) of the word matching
problem for ℛ has a solution over Δ if and only if the MPCP has a solution
for {(x_0, y_0)}. Hence, we obtain our first main result.

Corollary 7. Over the alphabet Δ the word matching problem is undecidable for
the finite, special, and confluent string-rewriting system ℛ.


3  A  restricted  variant  of  the  word  matching  problem 


As described in the introduction we are interested in the following restricted
variant of the word matching problem, which we call the special word matching
problem:

INSTANCE: A string u ∈ (Σ ∪ V)*.

QUESTION: Is there a substitution θ satisfying θ(u) ↔*_R λ?

Here we present a finite, special, and confluent string-rewriting system ℛ_1 for
which this problem is undecidable. We obtain ℛ_1 from the system ℛ constructed
in the previous section by adding three new rules.

Let a', b', @, and ¢ be four additional symbols, let Γ_0 := Γ ∪ {a', b'} (=
{a, b, a', b'} ∪ {1, ..., n} ∪ {S, #, $}), let Δ_0 := Γ_0 ∪ {@, ¢}, and let ℛ_1 be the
following string-rewriting system on Δ_0:

ℛ_1 := ℛ ∪ {aa' → λ, bb' → λ, @¢ → λ}.

It is easily seen that ℛ_1 is a finite, special string-rewriting system that is
confluent.

Now, for given strings x_0, y_0 ∈ {a,b}+, consider the following instance of
the special word matching problem for ℛ_1:

v_1 v_2 S y_0'~ @ v_1 S v_2 x_0'~ ··· ¢ v_1 v_4 $,

where ' : {a,b}* → {a',b'}* denotes the canonical isomorphism induced by
a ↦ a' and b ↦ b', and u~ denotes the reversal of the string u. Here v_1, v_2, v_3, v_4
are variables.

Lemma 8. If the instance {(x_0, y_0)} of the MPCP has a solution, then the above
instance of the special word matching problem has a solution for ℛ_1.


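Proof. Let i_1, ..., i_k ∈ {1, 2, ..., n} such that x_0 x_{i_1} x_{i_2} ··· x_{i_k} = y_0 y_{i_1} y_{i_2} ··· y_{i_k}.
Let w_1 := x_0 x_{i_1} x_{i_2} ··· x_{i_k}, w_2 := i_k S i_{k-1} S ··· i_2 S i_1, w_3 := #^(2|w_1|-1), and
w_4 := $^(2|w_1|-1).

Then we have the following reductions modulo ℛ_1:

w_1 w_2 S y_0'~ = y_0 y_{i_1} ··· y_{i_k} i_k S ··· i_2 S i_1 S y_0'~ →*_{ℛ_1} y_0 y_0'~ →*_{ℛ_1} λ,
w_1 S w_2 x_0'~ = x_0 x_{i_1} ··· x_{i_k} S i_k S i_{k-1} ··· i_2 S i_1 x_0'~ →*_{ℛ_1} x_0 x_0'~ →*_{ℛ_1} λ,
w_1 # w_3 = w_1 w_3 # = w_1 #^(2|w_1|) →*_{ℛ_1} λ, and
w_1 $ w_4 = w_1 w_4 $ = w_1 $^(2|w_1|) →*_{ℛ_1} λ.

Thus, if φ denotes the morphism defined by {v_i ↦ w_i | i = 1, ..., 4}, then
φ(v_1 v_2 S y_0'~ @ ··· ¢ v_1 v_4 $) = w_1 w_2 S y_0'~ @ ··· ¢ w_1 w_4 $ →*_{ℛ_1} λ. □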


We claim that also the converse implication holds. So let φ be a morphism
satisfying φ(v_1 v_2 S y_0'~ @ ··· ¢ v_1 v_4 $) →*_{ℛ_1} λ, and let w_i := φ(v_i), i = 1, ..., 4.
Without loss of generality we can assume that w_1, ..., w_4 are irreducible modulo
ℛ_1, that is, w_1, ..., w_4 ∈ IRR(ℛ_1). Denote φ(v_1 v_2 S y_0'~ @ ··· ¢ v_1 v_4 $) simply by w.

Lemma 9. w_1, ..., w_4 ∈ Γ_0*.

Proof.

Claim 1. |w_4|_@ = 0.

Proof. Assume that w_4 = g_1 @ g_2 for some g_1 ∈ Δ_0* and g_2 ∈ (Δ_0 \ {@})*.
Since w ends in w_4 $, and since @¢ → λ is the only rule of ℛ_1 containing an
occurrence of the symbol @, we see that w →*_{ℛ_1} λ implies that |g_2|_¢ > 0, that
is, g_2 = g_3 ¢ g_4 for some g_3 ∈ Γ_0*. Hence, w_4 = g_1 @ g_3 ¢ g_4, and g_3 →*_{ℛ_1} λ. This,
however, contradicts our assumption that w_4 ∈ IRR(ℛ_1). Thus, |w_4|_@ = 0. □

Claim 2. |w_1|_¢ = 0.

Proof. This follows analogously. □

Claim 3. |w_1|_@ = 0.

Proof. Assume that w_1 = g_1 @ h_1 for some g_1 ∈ (Δ_0 \ {¢})* and h_1 ∈ Γ_0*. Since
w ends in w_1 w_4 $, this implies that w_4 = h_2 ¢ g_2 for some h_2 ∈ Γ_0* satisfying
h_1 h_2 →*_{ℛ_1} λ. Since w also contains the substring w_1 $ w_4, and therewith the
substring h_1 $ h_2, we see that also h_1 $ h_2 →*_{ℛ_1} λ must hold. The system ℛ_1
contains only two rules that involve occurrences of the symbol $: a$$ → λ and
b$$ → λ. Hence, h_1 $ h_2 →*_{ℛ_1} λ implies that |h_1 h_2|_$ ≡ 1 mod 2, while h_1 h_2 →*_{ℛ_1}
λ implies |h_1 h_2|_$ ≡ 0 mod 2. This contradiction yields |w_1|_@ = 0. □

The string w begins with the prefix w_1 w_2. Since |w_1|_@ = 0, we obtain the
following analogously to Claim 1.

Claim 4. |w_2|_¢ = 0.

Claim 5. |w_3|_@ = 0.

Proof. Assume that w_3 = g_1 @ h_1 for some g_1 ∈ Δ_0* and h_1 ∈ (Δ_0 \ {@})*. Since
w_3 is irreducible, we see that |h_1|_¢ = 0, that is, h_1 ∈ Γ_0*. Since w contains the
substrings w_3 ¢ and w_3 # ¢, we conclude that h_1 →*_{ℛ_1} λ and h_1 # →*_{ℛ_1} λ. Now
w_3 being irreducible yields h_1 = λ, which in turn implies that h_1 # = # does not
reduce to λ. Thus, |w_3|_@ = 0. □

Claim 6. |w_3|_¢ = 0.

Proof. Let w_3 = g_1 ¢ h_1 for some g_1 ∈ Γ_0* and h_1 ∈ (Δ_0 \ {@})*. Since w
contains the substrings @ w_1 # w_3 and ¢ w_1 w_3, we see that w →*_{ℛ_1} λ and w_1 ∈ Γ_0*
imply that w_1 # g_1 →*_{ℛ_1} λ and w_1 g_1 →*_{ℛ_1} λ. The only rules of ℛ_1 that contain
occurrences of the symbol # are the following two: a## → λ and b## → λ.
Hence, w_1 # g_1 →*_{ℛ_1} λ implies that |w_1 g_1|_# ≡ 1 mod 2, while w_1 g_1 →*_{ℛ_1} λ implies
that |w_1 g_1|_# ≡ 0 mod 2. Thus, |w_3|_¢ = 0. □

Because of w_1 ∈ Γ_0* and w →*_{ℛ_1} λ, the fact that w contains the substrings
@ w_1 $ w_4 and ¢ w_1 w_4 implies analogously the following.

Claim 7. |w_4|_¢ = 0.

By Claims 2, 4, 6, and 7, w_1, w_2, w_3, w_4 ∈ (Δ_0 \ {¢})*. Thus, |w|_¢ = 5. Hence,
we also have |w|_@ = 5, and so w_1, w_2, w_3, w_4 ∈ Γ_0*. This completes the proof of
Lemma 9. □

Since w →*_{ℛ_1} λ, we can conclude the following from Lemma 9:

(1.) w_1 w_2 S y_0'~ →*_{ℛ_1} λ,
(2.) w_1 S w_2 x_0'~ →*_{ℛ_1} λ,
(3.) w_1 # w_3 →*_{ℛ_1} λ,
(4.) w_1 w_3 # →*_{ℛ_1} λ,
(5.) w_1 $ w_4 →*_{ℛ_1} λ,
(6.) w_1 w_4 $ →*_{ℛ_1} λ.

From Lemma 2 and its proof we see that the reductions (3.) to (6.) imply
that w_1 ∈ {a,b}*.

Lemma 10. Let x_0, y_0 ∈ {a,b}+. Then, for X_1, X_2 ∈ {a,b}*,

X_1 S Y x_0'~ →*_{ℛ_1} λ and X_2 Y S y_0'~ →*_{ℛ_1} λ

for some Y ∈ Γ_0* if and only if there exist indices i_1, ..., i_ℓ ∈ {1, ..., n} such
that

X_1 = x_0 x_{i_1} ··· x_{i_ℓ} and X_2 = y_0 y_{i_1} ··· y_{i_ℓ}.


Proof. If X_1 = x_0 x_{i_1} ··· x_{i_ℓ} and X_2 = y_0 y_{i_1} ··· y_{i_ℓ}, choose Y := i_ℓ S ··· i_2 S i_1.
Then

X_1 S Y x_0'~ = x_0 x_{i_1} ··· x_{i_ℓ} S i_ℓ S ··· i_2 S i_1 x_0'~ →*_{ℛ_1} x_0 x_0'~ →*_{ℛ_1} λ and
X_2 Y S y_0'~ = y_0 y_{i_1} ··· y_{i_ℓ} i_ℓ S ··· i_2 S i_1 S y_0'~ →*_{ℛ_1} y_0 y_0'~ →*_{ℛ_1} λ.

We prove the converse implication by contradiction. Let U_1, U_2 ∈ {a,b}* be
minimal counterexamples in terms of their combined length such that U_1 S Y x_0'~
→*_{ℛ_1} λ and U_2 Y S y_0'~ →*_{ℛ_1} λ for some Y ∈ Γ_0*. Obviously, we may assume
without loss of generality that Y ∈ Γ_0* is irreducible modulo ℛ_1. Clearly, Y ≠ λ,
since U_1 S x_0'~ ∈ {a,b}* · S · {a',b'}* ⊆ IRR(ℛ_1).

Claim 1. Y x_0'~ ∈ IRR(ℛ_1).

Proof. If Y x_0'~ is reducible modulo ℛ_1, then Y = Y_1 c and x_0'~ = c'z' for some
c ∈ {a,b}. However, then U_2 Y S y_0'~ ends in c S y_0'~, and we see from the form of
the rules of ℛ_1 that each descendant of U_2 Y S y_0'~ then also ends in c S y_0'~. □

Claim 2. Y S y_0'~ ∈ IRR(ℛ_1).

Proof. If Y S y_0'~ is reducible modulo ℛ_1, then Y = Y_1 y_i i for some i ∈ {1, ..., n}.
Hence, U_1 S Y x_0'~ ends in y_i i x_0'~, and therewith each descendant of U_1 S Y x_0'~ also
ends in y_i i x_0'~, since y_i ∈ {a,b}+. Thus, Y S y_0'~ ∈ IRR(ℛ_1). □

Thus, the only rule that is applicable to U_1 S Y x_0'~ is of the form x_p S p → λ,
that is, U_1 = U_1' x_p, Y = pY', and

U_1 S Y x_0'~ →_{ℛ_1} U_1' Y' x_0'~ →*_{ℛ_1} λ.

Hence, U_2 Y S y_0'~ = U_2 p Y' S y_0'~, and since Y' S y_0'~ is irreducible by Claim 2, we
see that the rule y_p p S → λ must apply, that is, U_2 = U_2' y_p, Y' = SY'', and

U_2 p Y' S y_0'~ →_{ℛ_1} U_2' Y'' S y_0'~ →*_{ℛ_1} λ.

Thus, U_1', U_2' ∈ {a,b}* satisfy U_1' Y' x_0'~ = U_1' S Y'' x_0'~ →*_{ℛ_1} λ and U_2' Y'' S y_0'~
→*_{ℛ_1} λ for some string Y'' ∈ Γ_0*. This, however, contradicts the minimality of
U_1 and U_2. □

Lemma 10 applied to the reductions (1.) and (2.) above implies that there
exist indices i_1, ..., i_ℓ ∈ {1, ..., n} such that w_1 = x_0 x_{i_1} ··· x_{i_ℓ} = y_0 y_{i_1} ··· y_{i_ℓ},
that is, the instance {(x_0, y_0)} of the MPCP has a solution. This observation
together with Lemma 8 yields the following equivalence.

Corollary 11. The instance {(x_0, y_0)} of the MPCP has a solution if and only
if the above instance of the special word matching problem has a solution for ℛ_1.

The choice of P thus implies the intended undecidability result.

Theorem 12. For the finite, special, and confluent string-rewriting system ℛ_1
the special word matching problem is undecidable.

4  Conclusion  and  open  problems 

We  have  shown  that  extending  Makanin’s  result  to  the  general  case  of  all  finite, 
special,  and  confluent  string-rewriting  systems  is  even  ‘more  impossible’  than  we 
thought.  The  simplicity  of  special  confluent  string-rewriting  systems  is  deceptive; 
they  are  powerful  enough  to  even  make  word  matching  problems  undecidable. 

However,  there  is  still  one  interesting  case  that  remains  open,  the  case  of 
finite,  special,  and  confluent  string-rewriting  systems  that  present  groups.  Note 
that  the  systems  constructed  above  do  certainly  not  present  groups,  since  sym¬ 
bols  like  a  do  not  have  left-inverses.  A  helpful  factor  for  attacking  this  open 
problem  is  the  fact  that  the  class  of  groups  that  are  presented  by  finite,  special, 
and  confluent  string-rewriting  systems  can  be  characterized  algebraically:  they 
are  exactly  those  groups  that  are  isomorphic  to  the  free  products  of  a  free  group 
of  finite  rank  and  finitely  many  finite  cyclic  groups  [Coc76].  In  any  case,  even 
if  this  problem  should  turn  out  to  be  decidable,  its  complexity  is  likely  to  be 
very  high,  since  Makanin’s  algorithm  for  free  groups  is  itself  not  even  primitive 
recursive  [KoPa]. 


Abstract. We investigate mutual dependencies of subexpressions of a computable
expression, in orthogonal rewrite systems, and identify conditions for
their concurrent independent computation. To this end, we introduce concepts
familiar from ordinary Euclidean Geometry (such as basis, projection,
distance, etc.) for reduction spaces. We show how a basis for an expression
can be constructed so that any reduction starting from that expression can
be decomposed as the sum of its projections on the axes of the basis. To
make the concepts more relevant computationally, we relativize them w.r.t.
stable sets of results, and show that an optimal concurrent computation of
an expression w.r.t. S consists of optimal computations of its S-independent
subexpressions. All these results are obtained for Stable Deterministic Residual
Structures, Abstract Reduction Systems with an axiomatized residual
relation, which model all orthogonal rewrite systems.


1  Introduction 

Efficient  evaluation  of  expressions  requires  concurrent  evaluation  of  subexpressions. 
In  computation  in  general,  it  is  normal  that  intermediate  results  of  computation 
of  different  subexpressions  are  used  by  other  subexpressions,  and  contribute  to 
creation of new computable subexpressions. In concurrent languages like the
π-calculus [Mil92] this is expressed explicitly by value-passing, while in sequential
languages computations in different subexpressions can only interact by joint creation
of  new  redexes.  Our  aim  in  this  paper  is  to  give  a  formal  numerical  characterization 
of  dependencies  of  subexpressions  of  an  expression  (or  subprograms  of  a  modu¬ 
lar  program),  and  in  particular  to  identify  conditions  for  independent  evaluation 
of  subexpressions.  Computation  of  different  independent  subexpressions  can  be  con¬ 
ducted  in  isolation  from  computations  elsewhere  in  the  expression,  concurrently,  and 
the  results  can  then  be  combined  to  yield  the  final  result. 

We  restrict  our  attention  to  functional  languages,  and  consider  their  operational 
model - orthogonal rewrite systems - of which the λ-calculus [Bar84] is the prime
example, although we believe that our results can be generalized to the non-orthogonal
case  and  cover  concurrent  languages  as  well.  To  remain  as  general  as  possible,  and 
at  the  same  time  to  avoid  syntactic  structure  of  computable  expressions  (terms, 
graphs,  etc.),  which  is  irrelevant  for  our  purpose,  we  assume  that  the  rewrite  sys¬ 
tem  is  given  in  the  form  of  a  Stable  Deterministic  Residual  Structure,  SDRS  [GK96]. 

Part  of  this  work  was  supported  by  the  Engineering  and  Physical  Sciences  Research 

SDRSs are Abstract Rewrite Systems with an axiomatized residual relation, which
model all orthogonal rewrite systems. Standard important results like the
Standardization and Normalization theorems can already be proven in SDRSs [GK96, KG96].
Furthermore, via Deterministic Family Structures, DFSs [GK96], which are SDRSs
with an axiomatized family relation on redexes, one can prove optimality results
of Lévy [Lev80], and achieve Prime Event Structure [Win89] style semantics for
orthogonal rewrite systems in a uniform way [KG97a].

The  idea  we  want  to  pursue  is  very  simple  and  natural,  and  the  concepts  we 
introduce  have  their  counterparts  in  ordinary  Euclidean  Geometry,  although  there 
will  be  some  differences.  For  expository  purposes,  let  us  assume  first  that  the  given 
SDRS is linear - there is no duplication or erasure of redexes. The main analogy
is  the  following.  In  a  Euclidean  3-dimensional  space,  one  can  decompose  a  vector  as 
the  sum  of  its  projections  on  the  axes  X,  Y  and  Z,  which  form  a  Euclidean  basis. 
Similarly,  we  can  construct  a  basis  at  any  expression  t,  consisting  of  independent 
reductions P_i starting from t, such that any reduction P starting from t can be
decomposed as the sum of its projections on P_i. Here P_i and P_j are independent
if no finite initial parts of them can interact, i.e., by joint creation of a new redex.
In the basis that we construct, every reduction P_i is a maximal reduction internal
to U_i, i.e., P_i contracts residuals of redexes in U_i and created redexes; every U_i is
independent, i.e., no reduction internal to U_i can interact with a reduction internal
to the complement Ū_i of U_i, which consists of redexes of t not in U_i; and the U_i are
pairwise non-overlapping, and cover all redexes of t.

Further, the distance |P,Q| between co-initial reductions P, Q is the number of
their 'different' steps, and characterizes 'how far apart' the reductions have
progressed. Here 'different steps' means that they cannot be related by the zig-zag
relation (which is the transitive and symmetric closure of the residual relation) [Lev80],
so they are in different zig-zag families. |P,Q| coincides with the minimal number
of reduction steps needed to reach a common reduct from the endpoints of P and
Q. This is different from the Euclidean measure of distance. For example, in the
simplest case, if two vectors P and Q are orthogonal (say parallel to axes X and Y
respectively), then the distance is |P,Q| = √(|P|² + |Q|²), while the distance between
reductions P and Q that contract redexes in different families is |P,Q| = |P| + |Q|.
However, this is because the Euclidean space is continuous and allows 'shortcuts'.
If we were to allow joining of the endpoints of the vectors P and Q only by moves
parallel to X and Y, then we would get the same distance measure as for reductions!
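
A tiny numeric sketch may help to compare the two measures. It is an illustration only; the function names are assumptions of this example, and the arguments stand for the lengths |P| and |Q| of two orthogonal vectors, respectively of two reductions contracting redexes in different families.

```python
import math


def euclidean_distance(p_len, q_len):
    """Distance between the endpoints of orthogonal vectors of the given lengths."""
    return math.hypot(p_len, q_len)


def reduction_distance(p_len, q_len):
    """Distance when only moves parallel to the axes are allowed: |P| + |Q|."""
    return p_len + q_len
```

For lengths 3 and 4 the Euclidean distance is 5, while the axis-parallel measure gives 7, the number of steps needed when no 'shortcuts' are available.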

Finally, the independence degree of a redex set U of an expression t is the length
of a shortest reduction P internal to U such that there is a reduction Q internal to Ū
that interacts with P, and is ∞ otherwise. So at least |P| steps can be performed in
U independently from the rest of the computation, after which results of computing
U and Ū must be combined in order for the computation to proceed 'as concurrently
as possible'. Note that U is independent if and only if its independence degree is ∞.

These concepts can very naturally be explained in terms of Prime Event Structures
(PES) [Win89], which in the conflict-free case (in which we are interested)
are simply event sets E partially ordered by a causal dependency relation ≤, such
that every event e ∈ E can only dominate a finite number of others. Computations
in a linear SDRS ℛ are interpreted as left-closed sets of events (i.e., closed under
≤), called configurations, in the PES 𝓔 = (E, ≤) whose events correspond to (the
zig-zag classes of) redexes in ℛ. Those event sets X_i ⊆ E that are closed under
≥ are independent, as they correspond to independent reductions in ℛ. Further, if
{X_i | i ∈ I} are disjoint independent sets covering E, they form a basis for E, as for
any configuration α, α = ∪_{i∈I} (α ∩ X_i). Here α ∩ X_i is the projection of α on X_i, and
coincides with the restriction of α to the set X_i^0 of all initial (i.e., minimal w.r.t. ≤)
events of X_i. And the set {X_i^0}_{i∈I} is an independent covering of the set of initial
events of E. Further, the distance between configurations α and β is defined as the
cardinality of (α ∪ β) \ (α ∩ β) (as is usual for sets), and it precisely corresponds to the
distance measure for reductions in linear SDRSs: |P,Q| = |α_P, α_Q|, where α_P, α_Q
are configurations corresponding to P, Q. The independence degree of a set α_0 of
initial events is the cardinality of the smallest configuration α, whose initial events
are in α_0, such that there exists a configuration β not containing elements of α_0 and
an event e such that α ∪ β ∪ {e} is a configuration, while neither α ∪ {e} nor β ∪ {e}
are (i.e., α and β both contribute to creation, or enabling, of e, and they interact to
create e).
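
The set-based reading of these notions is easy to experiment with. The sketch below is a hedged illustration: the four-event structure, the dictionary PREDS of direct causal predecessors, and all function names are assumptions made for this example only.

```python
# Hypothetical conflict-free event structure: 1 precedes 2, and 3 precedes 4.
EVENTS = {1, 2, 3, 4}
PREDS = {2: {1}, 4: {3}}  # direct causal predecessors of each event


def is_configuration(alpha, preds):
    """Left-closed: every predecessor of a member is itself a member."""
    return all(preds.get(e, set()).issubset(alpha) for e in alpha)


def is_independent(x, events, preds):
    """Closed under the converse order: successors of members are members."""
    return all(f in x for f in events if preds.get(f, set()) & x)


def distance(alpha, beta):
    """Cardinality of the symmetric difference of two configurations."""
    return len(alpha ^ beta)


def projections(alpha, basis):
    """Projection of a configuration on each set of the basis."""
    return [alpha & x for x in basis]
```

With the basis [{1, 2}, {3, 4}], the configuration {1, 3, 4} has projections {1} and {3, 4}, whose union gives the configuration back, and the distance between {1, 2} and {1, 3} is 2.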

Most  of  the  technical  difficulties  come  from  the  erasure  of  redexes  in  SDRSs. 
To  cope  with  the  erasure  problems,  and  to  have  (most  of  the)  concepts  invariant 
under  Levy-equivalence,  we  work  with  standard  reductions,  which  in  SDRSs  are 
reductions  in  which  later  steps  ‘do  not  erase’  the  preceding  ones  [KG96].  If  the  SDRS 
is duplicating, concepts like 'restriction of P to a redex-set U' cannot be defined
correctly for arbitrary P: we need P to be a family-reduction, that is, a multi-step
reduction contracting all members of a (zig-zag) family in parallel, in every multi-step.
However,  as  we  have  shown  in  [KG97a],  duplicating  SDRSs  can  be  interpreted  via 
non-duplicating,  also  called  affine,  SDRSs,  and  the  family-reductions  in  the  former 
become  reductions  in  the  latter.  Therefore,  via  that  encoding,  the  results  obtained 
here  for  affine  SDRSs  are  applied  to  all  SDRSs.  (Restriction  to  family-reductions 
is  inevitable  when  one  studies  adequate  simulation  of  a  duplicating  system  with  an 
affine  one  [KKSV94].) 

In  order  to  make  the  introduced  concepts  more  meaningful  computationally,  we 
relativize  them  w.r.t.  the  semantics  one  may  be  interested  in.  For  example,  in  the 
λ-calculus, one might be interested in computing normal forms, head-normal forms,
weak-head-normal forms, etc. In [GK96], we have characterized all reasonable sets
of finite '(partial) results' as stable sets S of terms, and have shown that (only)
w.r.t. stable sets S, S-needed reductions are S-normalizing. This allows us to ignore
S-unneeded redexes, and for example, we can define P, Q to be S-independent if
there is no joint creation of S-needed redexes. So reductions that interact may be
S-independent. This is profitable since redex sets that are not independent may
become S-independent, and this allows for finer independent splitting of redex-sets
of terms, implying more parallelism in the computation. And indeed, if {U_i} is
an S-independent covering of an S-normalizable term t, we show that an optimal
S-normalizing reduction is the sum of maximal S-needed reductions internal to U_i.

In Section 2, we recall SDRSs and DFSs. In Section 3, we introduce the restriction
and projection concepts and prove the Decomposition theorem. In Section 4,
we define the geometry of orthogonal reduction spaces, and prove the Independent
Decomposition theorem. In Section 5, we relativize the geometry w.r.t. stable sets of
results S, and show that optimal computation w.r.t. S can be achieved by combining
optimal computations of S-independent redex-sets. Conclusions appear in Section 6.

2 Deterministic Residual and Family Structures

Let us recall some basic theory for DRSs and DFSs developed in [GK96, KG96,
KG97]. DRSs are Abstract Reduction Systems (ARSs) with axiomatized notions of
residual. A definition and a survey of results about ARSs can be found in [Klo92].
Our definition is slightly different, and follows that of Hindley [Hin69].

An ARS is a triple A = (Ter, Red, →), where Ter is a set of terms, ranged over
by t, s, o, e; Red is a set of redexes (or redex occurrences), ranged over by u, v, w; and
→ : Red → (Ter × Ter) is a (total) function such that for any t ∈ Ter there is only
a finite set of u ∈ Red such that →(u) = (t, s), written t →^u s. This set will be known
as the redexes of term t, where u ⊆ t denotes that u is a member of the redexes of
t and U ⊆ t denotes that U is a subset of the redexes. Note that one can identify u
with the triple t →^u s. A reduction is a sequence t →^{u_1} t_2 →^{u_2} ⋯.

Notation Reductions are denoted by P, Q, N. We write P : t ↠ s or t ↠^P s if
P denotes a reduction (sequence) from t to s, write P : t ↠ if P may be infinite,
and write P : t ↠ ∞ if P is infinite (i.e., of length ω). P + Q denotes the
concatenation of P and Q. u also denotes the reduction that contracts u. The final
term of a finite reduction P is denoted by ft(P). If U ⊆ t, then Ū will denote the
complement of U, i.e., the set of redexes in t not in U.

DRSs model orthogonal rewrite systems. They are similar to Stark’s Determinate
Concurrent Transition Systems (DCTSs) [Sta89] and ARSs of Gonthier et
al. [GLM92]. Unlike DCTSs, the residual relation in DRSs may be duplicating, and
unlike ARSs of [GLM92], we do not have a nesting relation on redexes. Several refined
concepts of abstract rewriting are studied in [Oos94, Mel96, Raa96].

Definition 2.1 A DRS is a pair R = (A, /), where A is an ARS and / is a residual
relation on redexes relating redexes in the source and target term of every reduction
t →^u s ∈ A, such that for v ⊆ t, the set v/u of residuals of v under u is a set of redexes
of s; a redex in s may be a residual of only one redex in t under u, and u/u = ∅. If
v has more than one u-residual, then u duplicates v. If v/u = ∅, then u erases v. A
redex of s which is not a residual of any v ⊆ t under u is said to be u-new or created
by u. The set u/P of residuals of u under P is defined by transitivity.

A development of U ⊆ t is a reduction P : t ↠ that only contracts residuals
of redexes from U; it is complete if U/P = ∪_{u ∈ U} u/P = ∅. Development of ∅ is
identified with the empty reduction. U will also denote a complete development of
U ⊆ t. The residual relation satisfies the following two axioms:

• [FD] ([GLM92]) All developments are terminating; all complete developments
of U ⊆ t end at the same term; and residuals of a redex v ⊆ t under all complete
developments of U are the same.

• [weak acyclicity] ([Sta89]) Let u, v ⊆ t, u ≠ v, and u/v = ∅. Then v/u ≠ ∅.

We  call  a  DRS  R  stable  (SDRS)  if: 

• [stability] If u, v ⊆ t are different redexes, t →^u e, t →^v s, and u creates a redex
w ⊆ e, then the redexes in w/(v/u) are not u/v-residuals of redexes of s, i.e., they
are created along u/v.

We call a DRS R non-duplicating or affine if a redex may have at most one
residual under contraction of another redex. Affine SDRSs will be called ASDRSs.

In a DRS R, the residual relation on redexes is extended to all co-initial finite
reductions exactly as in syntactic orthogonal rewrite systems [HL91, Lev80, Sta89]:
(P_1 + P_2)/Q = P_1/Q + P_2/(Q/P_1) and P/(Q_1 + Q_2) = (P/Q_1)/Q_2, and
Levy-equivalence or permutation-equivalence, ≈_L, is defined as the smallest relation on
co-initial reductions satisfying: U + V/U ≈_L V + U/V for any U, V ⊆ t, and
Q ≈_L Q' ⇒ P + Q + N ≈_L P + Q' + N. Further, one defines P ≤ Q iff P/Q = ∅, and can
show that P ≈_L Q iff P ≤ Q and Q ≤ P; and P ≤ Q iff Q ≈_L P + N for some N.
The following Strong Church-Rosser property can be proved: for any co-initial finite
reductions P, Q, P ⊔ Q ≈_L Q ⊔ P, where P ⊔ Q = P + Q/P.

The relations ≤, ≈_L and / are extended to co-initial possibly infinite reductions
N, N' as follows. N ≤ N', or equivalently, N/N' = ∅ if, for any redex v contracted
in N, say N = N_1 + v + N_2, v/(N'/N_1) = ∅; and N ≈_L N' iff N ≤ N' and N' ≤ N.
Here, for any infinite P, u/P = ∅ (we say that u is erased in P or that u is P-erased) if
u/P' = ∅ for some finite initial part P' of P, and P/Q is only defined for finite Q,
as the reduction whose initial parts are residuals of initial parts of P under Q.

The essence of stability is better understood by the following lemma, which extends
the [stability] axiom from one-step reductions to any co-initial external reductions,
that is, reductions that do not contract redexes having common residuals.

Definition 2.2 ([GK96]) • Let u ∈ U ⊆ t and P : t ↠. We call P external to U
(resp. u) if P does not contract residuals of redexes in U (resp. residuals of u).

• Let P : t_0 →^{u_0} t_1 →^{u_1} ⋯ and Q : t_0 = s_0 →^{v_0} s_1 →^{v_1} ⋯. We call P
external to Q if for any i, j, u_i/(Q_j/P_i) ∩ v_j/(P_i/Q_j) = ∅, where P_i and Q_j
denote the initial parts of P and Q of lengths i and j.

Lemma 2.3 (Stability [GK96]) Let P : t ↠ s be external to Q : t ↠ e, in an
SDRS, and let P create redexes W ⊆ s. Then the residuals W/(Q/P) of redexes in
W are created by P/Q, and Q/P is external to W.

Definition 2.4 ([KG96]) • Let P : t ↠ and u ⊆ t. We call u P-needed if there is
no Q ≈_L P that is external to u, and call it P-unneeded otherwise.

• Let Q : t ↠, P' : t ↠ s, and u ⊆ s. We call u (or more precisely, u
with creation history P', denoted by P'u) Q-needed if u is Q/P'-needed. We call P
Q-needed if so is every redex contracted in P.

• We call P self-needed or standard if it is P-needed. We write Q ≈_S P if Q ≈_L P
and Q, P ∈ STA, where STA denotes the set of all standard reductions. We call N
a standard variant of P if P ≈_L N and N ∈ STA.

Note  that  P-neededness  does  not  depend  on  the  choice  of  a  reduction  in  the  class 
of  reductions  Levy-equivalent  to  P,  but  this  is  not  true  for  the  externality  concept. 

The following is a relativized standardization algorithm for reductions in ASDRSs.
Let P, Q : t ↠. The canonical P-needed variant of Q, ST_P(Q), is defined as
follows: let u ⊆ t be such that it is P-needed and its residual is contracted in Q first
among P-needed residuals of P-needed redexes in t. Then ST_P(Q) = u + ST_{P/u}(Q/u).
If there is no such redex in t, then ST_P(Q) = ∅. We write ST(P) for ST_P(P).

The Standardization theorem [KG96], when restricted to ASDRSs, states that,
for co-initial reductions Q, P, finite or infinite, ST_P(Q) is a standard P-needed
reduction whose length coincides with the number of P-needed steps in Q, and
ST_P(Q) ≤ Q, P. Further, if Q is finite, then Q ≈_L ST(Q); otherwise, Q ≈_L ST(Q)
need not hold.

It has been shown in [KG97] that, in ASDRSs, all standard variants of a finite
reduction P can be constructed effectively (as P-neededness is decidable and there
are only a finite number of such reductions, all of the same length), and that ≈_S is
decidable. So standard reductions can be used as canonical representatives of their
Levy-equivalence classes (which may have an infinite number of elements).

Next we recall an axiomatization of Levy’s concept of redex-family for DRSs.
All family and sharing concepts for orthogonal reduction systems known to us (such
as [Lev80, Mar92, AL94, Oos96]) satisfy our family axioms, which allow for abstract
proofs of Relative Normalization and Optimality Theorems [GK96].

Definition 2.5 ([GK96]) A Deterministic Family Structure (DFS) is a triple F =
(R, ≃, ↪), where R is a DRS; ≃ is an equivalence relation on redexes with histories;
and ↪ is the contribution relation on co-initial families, defined as follows:

(1) For co-initial reductions P and Q, a redex Qv in the final term of Q (read as v
with history Q) is called a copy of a redex Pu if P ≤ Q, i.e., P + Q/P ≈_L Q, and v is
a Q/P-residual of u; the zig-zag relation ≃_z is the symmetric and transitive closure
of the copy relation. The family relation ≃ is an equivalence relation among redexes
with histories containing ≃_z. A family is an equivalence class of the family relation;
families are ranged over by φ, φ', …. Fam( ) denotes the family of its argument.

(2) Further, ≃ and ↪ satisfy the following axioms:

• [initial] Let u, v ⊆ t and u ≠ v, in R. Then Fam(∅_t u) ≠ Fam(∅_t v), where ∅_t
is the empty reduction starting from t.

• [contribution] φ ↪ φ' iff for any Pu ∈ φ', P contracts at least one redex in φ.

• [creation] If P : e ↠ t, t →^u s, and u creates v ⊆ s, then Fam(Pu) ↪ Fam((P + u)v).

• [FFD] (Finite Family Developments) Any reduction that contracts redexes of
a finite number of families is terminating.

It is shown in [GK96] that every DFS is a stable DRS. Further, we have proven
in [KG97] that the zig-zag relation ≃_z as well as the zig-zag contribution relation ↪_z
are decidable in ASDRSs, and that ≃_z is a family relation.

Below, FAM(P) (resp. SFAM(P)) denotes the set of zig-zag families, or simply
families, whose member (resp. P-needed) redexes are contracted in P, in an ASDRS.
Further, for any U ⊆ t, FAM_0(U) denotes the set of families (relative to t) of redexes
in U, and FAM^+(U) will denote the minimal set of families containing FAM_0(U)
and closed under the contribution relation ↪_z.


3  Decomposition  of  Reductions  in  ASDRSs 

In this section, we introduce restriction of a reduction to a redex-set, and its projection
onto another reduction, study their properties, and use them to decompose
reductions as the sum of their restrictions to non-overlapping redex sets.

Let P : t ↠ be a reduction in a DRS, and let U ⊆ t be a set of redexes in t.
We call P internal to U or a U-reduction if it is external to Ū, that is, if it contracts
residuals of redexes in U and created redexes. We call such redexes U-redexes.

Definition 3.1 We call ST_P(ST(Q)) the projection of Q onto P, written Q|P.

Definition 3.2 (1) Let t be a term in an ASDRS R, let U ⊆ t, and let P : t ↠ s
be standard. The concepts P respects U and the restriction of P to U, written P|U,
are defined by induction on n = |P| as follows. If n = 0, then P respects U and
P|U = ∅. Now let P = P' + u and let P' respect U. Assume that P'|U and P'|Ū are
defined as reductions internal to U and Ū, respectively, such that P' ≈_L P'|U ⊔ P'|Ū.
Then we say that P respects U if either u = u'/(P'|Ū / P'|U) for u' ⊆ ft(P'|U) such
that (P'|U) + u' is still internal to U, or u = u'/(P'|U / P'|Ū) for u' ⊆ ft(P'|Ū)
such that (P'|Ū) + u' is still internal to Ū. In the first case (depicted in the picture
below), we define P|U = P'|U + u' and P|Ū = P'|Ū, and define P|U = P'|U and
P|Ū = P'|Ū + u' in the second case.


[Figure: the first case of Definition 3.2, with sides labelled P'|U and P'|Ū / P'|U.]


(2) We say that a finite reduction Q respects U if so does ST(Q), and define Q|U =
ST(Q)|U. We say that Q respects a covering Ω = {U_i}_{i ∈ I} if it respects every U_i.

One can easily show that Definition 3.2 is correct, that is, ST(P) ≈_L P|U ⊔ P|Ū.
The intuition is that P respects U iff ST(P) contracts only redexes to which either
only redexes in U contribute, or only those in Ū, but not redexes in both U and
Ū. More precisely, if U ⊆ t and P : t ↠ s, then P respects U iff SFAM(P) ⊆
FAM^+(U) ∪ FAM^+(Ū), and in the latter case, (S)FAM(P|U) = SFAM(P) ∩
FAM^+(U) and SFAM(P) = (S)FAM(P|U) ∪ (S)FAM(P|Ū). Further, if P, Q are
co-initial, then (S)FAM(ST_P(Q)) = (S)FAM(Q) ∩ SFAM(P). It follows that the
restriction and projection concepts for finite reductions are invariant under ≈_L.

In the above definition, we need to take a standard variant of Q before restricting
it to U to ensure that the restriction notion is invariant under Levy-equivalence. As
shown by the following simple example, this is necessary. Let R = {f(x) → a, g(x) → x},
let P : f(g(x)) →^v f(x) →^u a, and let U = {v}. Then ‘direct restriction’ of P to U is
v, while P|U = ST(P)|U = ∅, and v ≉_L ∅.
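
The example can be replayed mechanically. The following is a minimal sketch, assuming terms are encoded as nested Python tuples and redexes are addressed by argument paths; the helper names `contract` and `run` are illustrative and are not taken from the paper. It only shows that P and the one-step reduction contracting the outer redex end at the same term, so the inner step of P contributes nothing to the result.

```python
# Rules of the example: f(x) -> a and g(x) -> x.
# A term is either a string (constant or variable) or a pair (symbol, argument).

def contract(term, path=()):
    """Contract the redex at `path`, a tuple of child indices (1 = the argument)."""
    if isinstance(term, str):
        raise ValueError("no redex at a constant or variable")
    if path:
        children = list(term)
        children[path[0]] = contract(term[path[0]], path[1:])
        return tuple(children)
    symbol, argument = term
    if symbol == "f":
        return "a"          # f(x) -> a erases its argument
    if symbol == "g":
        return argument     # g(x) -> x
    raise ValueError(f"no rule for symbol {symbol!r}")

def run(term, paths):
    """Apply a reduction given as a list of redex paths, left to right."""
    for path in paths:
        term = contract(term, path)
    return term

t = ("f", ("g", "x"))
P = [(1,), ()]        # contract the inner g-redex, then the outer f-redex
OUTER_ONLY = [()]     # contract the outer f-redex directly

print(run(t, P), run(t, OUTER_ONLY))
```

The sketch does not compute residuals or neededness; it is meant only to make the erasure of the inner redex by the outer step concrete.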

We call a U-reduction P U-fair if each U-redex is erased in P, and call it strongly
U-cofinal if, for any U-reduction Q, Q ≤ P. If U is the set of all redexes in t, then
U-fair reductions are fair, and strongly U-cofinal reductions will be called strongly
cofinal. One can show that a U-reduction P is strongly U-cofinal iff it is U-fair.
(Recall that if P is fair, then it is cofinal, but not conversely [Klo80].)

If Q|P is finite, then so is P|Q and P|Q ≈_L Q|P. Further, for any P : t ↠
internal to U ⊆ t such that SFAM(P) = FAM^+(U), and any finite Q : t ↠ s,
Q|U = Q|P. If the SDRS is linear, then every U-fair reduction is such.

Let U_i with i ∈ I be nonempty sets of redexes in t such that ∪_{i ∈ I} U_i contains
each redex of t and U_i ∩ U_j = ∅ when i ≠ j. Then we call the set Ω = {U_i}_{i ∈ I} a
(redex-)covering of t.
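
As a small illustration, the three conditions of a redex-covering (nonempty parts, pairwise disjoint, jointly exhausting the redexes of the term) can be checked directly. This is a sketch only, assuming redexes are represented by hashable labels; the function name `is_covering` is hypothetical.

```python
def is_covering(redexes, parts):
    """Return True iff `parts` is a redex-covering of the set `redexes`."""
    redexes = set(redexes)
    parts = [set(p) for p in parts]
    if any(not p for p in parts):
        return False                              # every U_i must be nonempty
    union = set().union(*parts)
    if union != redexes:
        return False                              # the U_i must exhaust the redexes of t
    return sum(len(p) for p in parts) == len(union)   # pairwise disjointness

print(is_covering({"u", "v", "w"}, [{"u"}, {"v", "w"}]))
```

A covering in this sense is simply a partition of the redex set of the term; independence of the parts is a separate, stronger requirement introduced in Section 4.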

The restriction concept enjoys nice algebraic properties: If P : t ↠ s respects
U_1, U_2 ⊆ t, then P respects U_1 ∪ U_2 and U_1 ∩ U_2; and P|(U_1 ∪ U_2) ≈_S P|U_1 ⊔ P|U_2
and P|(U_1 ∩ U_2) ≈_S (P|U_1)|U_2. This allows us to prove the following

Theorem 3.3 (Decomposition) Let Ω = {U_i}_{i ∈ I} be a redex-covering of a term t
in an ASDRS R.

(1) Let P_i be finite reductions internal to U_i, and let P = ⊔_i P_i. Then P respects
Ω and P|U_i ≈_S P|P_i.

(2) Let P : t ↠ s respect Ω. Then P ≈_L ⊔_i P|U_i.

4  The  Geometry  of  Reduction  Spaces 

In  this  section,  we  introduce  the  Reduction  Geometry  and  prove  the  Independent 
Decomposition  theorem,  which  reflects  the  main  analogy  of  orthogonal  reduction 
spaces  with  the  Euclidean  Geometry. 

Let P : t ↠. We call the strict domain of P, written SDom(P), the minimal
set of redexes U ⊆ t such that P is internal to U. We call the domain of P, written
Dom(P), the set ∪_{Q ≈_L P} SDom(Q), i.e., the minimal set of redexes U ⊆ t such that
any Q that is Levy-equivalent to P is internal to U. And we call the minimal domain
of P, written MDom(P), the set ∩_{Q ≈_L P} SDom(Q).

It is easy to see that Dom(P) is SDom(P) augmented by all P-erased redexes
not contracted in P, and MDom(P) is the set of all P-needed redexes in t. Obviously
P ≈_L Q implies Dom(P) = Dom(Q) and MDom(P) = MDom(Q), but
not SDom(P) = SDom(Q). It follows from the Standardization Theorem that
MDom(P) = SDom(ST(P)) for any P.

Definition 4.1 (The Reduction Geometry) Let t be a term in an ASDRS R.

• Let P : t ↠ s and Q : t ↠ e. We say that P and Q are independent or do
not interact, written P ⊥ Q, if MDom(P) ∩ MDom(Q) = ∅ and any created redex
in ft(P ⊔ Q) is a residual of a redex either from ft(P) or from ft(Q).

• We call a set Π = {P_i}_{i ∈ I} of reductions starting from t independent if P_i' ⊥ […]
for every i ∈ I and any finite initial parts P_i' of P_i. We call Π a basis of R
at t if Π is independent and for any P : t ↠ s, P ≤ ⊔ P_i.

• The distance |P, Q| between co-initial finite reductions P, Q : t ↠ is the
number of families whose essential member redexes are contracted either in P or
in Q (but not in both). Here a redex v ⊆ s is essential [Kha93] (or Maranget-needed
[Mar92]) if in any fair reduction starting from s a residual of v is contracted.

• The independence degree of U ⊆ t is the length of a shortest finite P internal
to U such that there exists a reduction Q external to U that interacts with P, and
is ∞ otherwise.

• We call U ⊆ t independent if every pair of finite U- and Ū-reductions is so.
We call a redex-covering Ω = {U_i}_{i ∈ I} of t an independent covering if each U_i is
independent.

Example 4.2 (Bases) Consider a term t containing three redexes u, v, w, let
w/(u ⊔ v) = ∅, w/u ≠ ∅, w/v ≠ ∅, and assume no redexes can be created by contraction of
these redexes. Then Π_1 = {u, v}, Π_2 = {u, v, w}, Π_3 = {v, w ⊔ u} and Π_4 = {u, v ⊔ w}
are all bases at t (there are others too), as all reductions are independent, and
u ⊔ v ≈_L u ⊔ v ⊔ w ≈_L v ⊔ (w ⊔ u) ≈_L u ⊔ (v ⊔ w) are all normalizing, hence strongly
cofinal. For Π_1, the strict domains of the axes do not form a covering of t, while
for other bases they do. Note also that for Π_4, u erases the second step of v ⊔ w:
(w/v)/(u/v) = ∅.
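
The last remark can be checked with the residual equations of Section 2. The derivation below is a sketch under the assumptions of the example (no creation, w/(u ⊔ v) = ∅), writing u ⊔ (v ⊔ w) for the join of the two axes u and v ⊔ w = v + w/v. Since u + v/u and v + u/v are both complete developments of {u, v}, the [FD] axiom gives (w/v)/(u/v) = w/(v + u/v) = w/(u + v/u) = ∅.

```latex
\begin{align*}
u \sqcup (v \sqcup w)
  &= u + (v \sqcup w)/u              && \text{definition of } \sqcup \\
  &= u + (v + w/v)/u                 && v \sqcup w = v + w/v \\
  &= u + v/u + (w/v)/(u/v)           && (P_1 + P_2)/Q = P_1/Q + P_2/(Q/P_1) \\
  &= u + v/u + \emptyset             && w/(v + u/v) = w/(u \sqcup v) = \emptyset \\
  &= u \sqcup v .
\end{align*}
```

So the join of the two axes has two steps, one fewer than the total length of the axes: the step w/v of the second axis is erased.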

Note that, in the definition of P ⊥ Q, a created redex in ft(P ⊔ Q) cannot
be a residual of redexes from both ft(P) and ft(Q), as otherwise the same redex
would be a residual of redexes from ft(ST(P)) and ft(ST(Q)), which is impossible
by the Stability Lemma (since MDom(P) ∩ MDom(Q) = ∅ implies that
ST(P) and ST(Q) are external; the converse implication need not hold). Note also
that, if P ⊥ Q, Dom(P) ∩ Dom(Q) = ∅ need not hold. Indeed, consider the modified
example from [Lev80]: let t = (λx.K_a(xY))K_b, where K_a = λx.a, K_b = λx.b,
and Y = (λx.f(xx))(λx.f(xx)), and let P : t → K_a(K_b Y) → K_a(K_b(fY)) → K_a b and
Q : t → (λx.K_a(x(fY)))K_b → (λx.a)K_b. Then Y ∈ Dom(P), Dom(Q), but Y ∉
MDom(P), MDom(Q), since Y is not needed either in P or in Q.

In the definition of distance between reductions P and Q, one might think that
it would be more appropriate to consider (P ⊔ Q)-needed redexes only. The following
example shows that the distance would then not be a metric. Indeed, take t = Kxu,
P : […], Q : […] and N : […]. Then |P, Q| = 3 and |P, N| = |N, Q| = 1.
It is easy to check that our distance measure on finite co-initial reductions
satisfies the triangle inequality. To make it a metric, we define for co-initial finite
reductions P, Q, P ≈_f Q iff FFAM(P) = FFAM(Q), where FFAM(P) denotes
the set of families of essential redexes in P. Clearly, ≈_f is an equivalence relation,
and the (co-initial) reduction space quotiented w.r.t. it is a metric space, as |P, Q| = 0
implies P ≈_f Q. Note that […], but not conversely.
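
Read set-theoretically, the distance of Definition 4.1 is the size of the symmetric difference of the two sets of essential families, which is why the triangle inequality holds. The following is a minimal sketch, assuming the sets FFAM(P) have already been computed and are given as Python sets of hypothetical family labels; the function name `distance` is illustrative.

```python
from itertools import combinations

def distance(ffam_p, ffam_q):
    """|P, Q|: number of essential families contracted in exactly one of P and Q."""
    return len(set(ffam_p) ^ set(ffam_q))

# Hypothetical family labels; every subset stands for FFAM of some reduction.
families = ["phi1", "phi2", "phi3"]
subsets = [set(c) for r in range(len(families) + 1)
           for c in combinations(families, r)]

# Check the metric laws on all triples of subsets.
triangle_ok = all(distance(a, c) <= distance(a, b) + distance(b, c)
                  for a in subsets for b in subsets for c in subsets)
zero_ok = all((distance(a, b) == 0) == (a == b)
              for a in subsets for b in subsets)
print(triangle_ok, zero_ok)
```

The second check mirrors the remark in the text: distance zero means equal sets of essential families, which identifies reductions only up to the equivalence under which the space is quotiented.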

The independence degree of U ⊆ t, if finite, characterizes the minimal amount of
work that can be performed in U independently from the rest of the computation.

It follows easily from Definition 3.2 and Definition 4.1 that U ⊆ t is independent
iff any finite reduction P : t ↠ s respects it. Now, using Theorem 3.3.(2), we can
prove the following

Theorem 4.3 (Independent Decomposition) Let Ω = {U_i}_{i ∈ I} be an independent
redex-covering of a term t in an ASDRS R, let P : t ↠ s, and let P_i be U_i-fair.
Then P ≈_L ⊔_i P|U_i. Further, Π = {P_i}_{i ∈ I} is a basis at t, and there are reductions
P_i' ≤ P_i such that P ≈_L ⊔_i P_i'.

We have seen in Example 4.2 that not all bases are of the form described in
Theorem 4.3. That is, if {P_i}_{i ∈ I} is a basis at t, P_i need not be a U_i-fair reduction for
some independent covering Ω = {U_i}_{i ∈ I} of t, as is the case for Π_1 (since w/u ≠ ∅
and w/v ≠ ∅). We could exclude this situation by requiring in the definition of
independence of U ⊆ t that for any pair of finite reductions P, Q respectively internal
and external to U, Q does not erase any steps of P, that is, |P| = |P/Q|. We have
chosen not to do so, since also in the relativized bases which we introduce in the
next section, axes do not need to be maximal reductions on their strict domains.

Note that every term t in an ASDRS has an independent redex covering, {U(t)},
where U(t) is the set of all redexes of t, and has an independent basis, a fair
reduction starting from t. One can construct finer bases from existing ones, as if
Ω = {U_i}_{i ∈ I} and Ω' = {U_j'}_{j ∈ J} are bases, then Ω ∩ Ω' = {U_i ∩ U_j'}_{(i,j) ∈ (I,J)} is a
basis too. It is interesting to note that for any P : t ↠ s and a created redex u ⊆ s,
any ‘smallest’ reduction needed to create u, obtainable by extraction of Pu [Lev80],
which for ASDRSs is defined in [KG97], is internal to some finest independent set
of redexes in t.

5 The Optimal Decomposition Theorem

Next we show that an optimal computation of a term, w.r.t. a stable set S of results,
can be decomposed into optimal computations of its S-independent redex-sets.

The concepts introduced in Definition 4.1 (independence of reductions and redex
sets, covering, basis, etc.) immediately relativize w.r.t. any stable set S, simply by
replacing ‘independence’, ‘covering’, ‘basis’, etc. with ‘S-independence’, ‘S-covering’,
‘S-basis’, etc., respectively, and by replacing ‘(essential) redex’ and ‘reduction’ with
‘S-needed redex’ and ‘S-needed reduction’. Recall that, for any set of terms S, a
redex u ⊆ t is S-needed iff at least one residual of it is contracted in any reduction
from t to a term in S, and S is called stable if (a) it is closed under reduction (this
condition can be relaxed slightly), and (b) any step entering S is S-needed. The Relative
Normalization theorem [GK96], for ASDRSs, states that any S-normalizable
term t ∉ S contains an S-needed redex, any S-needed reduction starting from t is
eventually S-normalizing, and is a shortest S-normalizing reduction starting from t.

Let U ⊆ t. We call a U-reduction P : t ↠ (U, S)-fair if each S-needed U-redex is
erased in P (P need not be U-fair). It is not difficult to show that, if Ω = {U_i | i ∈ I}
is an S-independent covering of an S-normalizable term t ∉ S, in an ASDRS, then
P : t ↠ s is an S-normalizing S-needed reduction iff P_i = P|U_i : t ↠ s_i are
(U_i, S)-fair S-needed U_i-reductions; and P_i is an optimal (U_i, S)-fair U_i-reduction
iff it is an S-needed (U_i, S)-fair U_i-reduction. Hence we have from the Relative
Normalization theorem that

Theorem 5.1 (Optimal Decomposition) Let S be a stable set of terms in an
ASDRS R, let Ω = {U_i}_{i ∈ I} be an S-independent covering of an S-normalizable term
t in R, let Ω' = {U_j}_{j ∈ J ⊆ I} contain all U_i that contain at least one S-needed redex
of t, and let P_j be internal to U_j. Then P_j are optimal (i.e., shortest) (U_j, S)-fair
reductions iff P = ⊔_j P_j is an optimal S-normalizing reduction starting from t.


6  Conclusions 

We have defined concepts similar to those in Vector Spaces for orthogonal rewrite
systems, and described how these can be used in distributed evaluation of sequential
programs. The constructed Reduction Geometry is not just a nice piece of mathematics.
Obviously, (relative) independence of redex-sets is undecidable in general,
as is neededness. However, we hope that decidable approximations for independence
can be defined which will yield decidable concepts for large classes of rewrite systems,
as is the case for neededness [HL91]. For example, all the introduced concepts
are decidable for Recursive Program Schemes, both in first [Kha93] and higher
order [Kha94] cases, but the latter do not have full computational power (as the
if-then-else operator is only evaluated semantically). Actually, because of a
specific simple form of redex-creation in such systems, one has maximal possible
independence there: any redex forms an independent redex-set. Further, TRSs in
which there is no upwards creation of redexes (such as Klop’s TRS which models a
Turing machine, in Exercise 2.2.21 of [Klo92]) do have full computational power, and
any set consisting of all redexes occurring inside an outermost redex is independent.

Abstract

Despite  the  major  role  that  modularity  occupies  in  computer  science,  all  the 
known  results  on  modular  analysis  only  treat  particular  problems,  and  there 
is  no  general  unifying  theory.  In  this  paper  we  provide  such  a  general  theory  of 
modularity.  First,  we  study  the  space  of  the  criteria  for  modularity  (the  so-called 
modularity  space),  and  give  results  on  its  complexity.  Then,  we  introduce  the 
notion  of  vaccine  and  show  how  it  can  be  used  to  completely  analyze  the  modular 
space.  It  is  also  shown  how  vaccines  can  be  effectively  used  to  solve  a  variety 
of  other  modularity  problems,  providing  the  best  solutions.  As  an  application, 
we  successfully  apply  the  theory  to  the  study  of  modularity  for  term  rewriting, 
giving  for  the  first  time  optimality  results,  and  show  how  modularity  problems 
can  be  completely  solved. 


1  Introduction 

The field of modular analysis is of fundamental importance, and is nowadays attracting
increasing interest from the scientific community. In essence, modularity allows one to study
a complex object by studying its smaller subparts: given a ‘big’ object composed of
smaller subparts (via some composition operator), we want to state that it enjoys a
certain property by simply investigating its smaller subcomponents. Hence, modular
analysis allows one to develop correct complex objects ‘bottom-up’, just by building correct
smaller submodules, and dually to verify the correctness of a complex object by
decomposing it into its submodules and verifying them.

Besides its theoretical relevance, the increasing complexity of today's applications
has made modularity analysis a task of primary importance from the practical
side as well.

At  the  present  moment,  the  field  of  modular  analysis  consists  of  several  results  that 
study  the  modularity  of  a  particular  property  for  a  certain  specific  paradigm  (see  e.g. 
[7,  2,  20,  13,  8,  16,  5,  18]).  However,  there  is  no  general  theory  on  modular  analysis. 
In  this  paper,  we  introduce  such  a  theory. 

Given the property to be verified, and the ‘composition operator’ that builds complex
objects from smaller submodules, we analyze the corresponding modularity space,
that is to say the collection of all the criteria for the modularity of the property w.r.t.
the composition operator.

First, a complete description of this space by means of its maximal criteria is provided
(roughly speaking, the ‘best’ results that can be obtained), and its complexity
is studied (how many maximal criteria can exist). Next, we introduce the notion of
vaccine, which is used for analyzing the modularity space in an effective way. Intuitively,
a vaccine extracts from a possibly non-modular property a maximal modular
sub-property, that is, a maximal criterion of the modularity space for that property.
Therefore, vaccines provide a convenient way to represent the modularity space. We
propose a methodology for finding vaccines (and so the optimal modularity criteria).
Moreover, we provide suitable conditions that ensure that the analysis of the modularity
space is completely solved, i.e., it covers all the optimal criteria, and consequently
every possible modularity criterion (all the others being subsumed by the maximal
criteria).

Furthermore,  it  is  shown  that  an  analysis  which  is  completely  solved,  is  relevant 
for  the  study  of  the  class  of  the  disjunctive  criteria  (cf.  [13,  20]),  because  it  provides 
the  best  disjunctive  criterion. 

Finally, we also consider the other side of the coin, namely the case when modularity
does not hold. We introduce the notion of counterexample structure, which is used
together with the notion of vaccine for recovering the best description of the failure of
modularity. The above results are successfully applied to the study of the modularity
of important properties of term rewriting systems: termination, completeness and
uniqueness of normal forms (the only main properties of TRSs that are not modular).
In particular, we show that $\mathcal{C}_{\mathcal{E}}$-termination (cf. [5, 15]) is a maximal criterion, and
provide a formal justification, in terms of complexity, of the difficulty of the study of
the modularity of termination in TRSs. Moreover, we completely solve the problem
of the modularity of termination for left-linear TRSs, providing the only two optimal
criteria. We give analogous results for the other major properties of completeness and
uniqueness of normal forms, thus not only improving on all the works on the modularity
of these properties, but completely solving the problem of their modular analysis.

The paper is organized as follows. Section 2 starts with some short preliminaries.
Section 3 presents the notion of modular analysis and of a criterion
for modularity. Then, Section 4 introduces the modularity space and gives some results
on its complexity. In Section 5 the concept of vaccine is introduced. Next, Section
6 shows how vaccines can be employed for the study of the modularity
space via the notion of vaccines basis. Section 7 analyzes another kind of criteria,
the so-called disjunctive criteria, and shows how they can be analyzed via
vaccines. Section 8 performs the same task for the study of counterexample structures,
giving a complete analysis of the failure of modularity. Section 9 presents
practical applications of the theory to the field of term rewriting. Finally, Section 10
ends with some remarks on further applications of the theory.

2  Preliminaries 

$\mathcal{O}$ denotes the class of generic objects we will consider: every object is understood to
be in $\mathcal{O}$. As usual, properties of objects will be identified with the classes of objects
that belong to them. So, we will write equivalently $Q_1 \wedge Q_2$ or $Q_1 \cap Q_2$ to denote
the intersection of two properties $Q_1$ and $Q_2$. We will also write $\neg Q$ to indicate the
complement property of $Q$ (i.e. $T \in \neg Q$ iff $T \notin Q$).

As far as TRSs are concerned, we only require knowledge of the basic notions (see
e.g. [3, 7]). The reader interested in modularity topics of TRSs can find extensive
surveys in [14, 16].

3 Modularity

Suppose we want to perform the modular (w.r.t. some composition operator $\odot$) analysis
of the property $\mathcal{P}$: given a complex object $T_1 \odot \cdots \odot T_n$, we want to infer that it belongs to
$\mathcal{P}$ by separately analyzing its smaller submodules $T_1, \ldots, T_n$.

The best case occurs when the property $\mathcal{P}$ is modular (w.r.t. a binary composition
operator $\odot$): whenever $n$ objects $T_1, \ldots, T_n$ are in $\mathcal{P}$, their composition $T_1 \odot \cdots \odot T_n$ is
in $\mathcal{P}$ as well. Thus, to check that a complex object $T_1 \odot \cdots \odot T_n$ belongs to $\mathcal{P}$, it suffices
to check that its submodules $T_1, \ldots, T_n$ belong to $\mathcal{P}$. In general, however, $\mathcal{P}$ may not be
modular, and so we need a more general concept to formalize modular analysis. We
therefore define the notion of a criterion for modularity:

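Definition 3.1 $Q$ is a criterion (for the $\odot$-modularity of $\mathcal{P}$) if $Q \neq \emptyset$ and

$\forall T_1, \ldots, T_n.\; T_1 \in Q, \ldots, T_n \in Q \Rightarrow T_1 \odot \cdots \odot T_n \in \mathcal{P}$. □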

In the sequel we will often talk simply of criterion, omitting $\mathcal{P}$ and $\odot$.

So, having a criterion $Q$ we can perform modular analysis of a complex object
$T_1 \odot \cdots \odot T_n$ just by separately checking that every submodule belongs to $Q$.
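The notion of criterion can be illustrated on a small finite model. The following is only a sketch under stated assumptions: the objects are sets of hypothetical features `a`, `b`, `c`, the composition operator is set union, an object is healthy when it does not contain both `a` and `b`, and a criterion is assumed to be a non-empty class. None of these names come from the paper.

```python
from itertools import combinations

ATOMS = "abc"                    # hypothetical feature names
FORBIDDEN = [frozenset("ab")]    # hypothetical clash: 'a' together with 'b'

# All objects of the toy model: every subset of ATOMS.
OBJECTS = [frozenset(c) for r in range(len(ATOMS) + 1)
           for c in combinations(ATOMS, r)]

def compose(*parts):
    """Toy composition operator: the union of the feature sets."""
    return frozenset().union(*parts)

def healthy(t):
    """Toy property P: no forbidden combination of features occurs in t."""
    return not any(f <= t for f in FORBIDDEN)

def is_criterion(q):
    """Brute-force check that every composition of members of q is healthy."""
    q = list(q)
    if not q:                    # assumption: a criterion must be non-empty
        return False
    # Union is idempotent, so compositions with repetitions add nothing new.
    return all(healthy(compose(*group))
               for n in range(1, len(q) + 1)
               for group in combinations(q, n))

without_b = [t for t in OBJECTS if "b" not in t]
all_healthy = [t for t in OBJECTS if healthy(t)]
print(is_criterion(without_b))    # objects lacking 'b' compose safely
print(is_criterion(all_healthy))  # fails: {'a'} composed with {'b'} is sick
```

In this toy model the healthy objects themselves do not form a criterion, so the property is not modular, while the class of objects without `b` is a criterion.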


3.1  Assumptions 

Given the property $\mathcal{P}$ whose modular behaviour we want to analyze, we call
healthy the objects in $\mathcal{P}$, and sick the others (the reasons for this terminology will
become clear when we introduce vaccines in Section 5). We say that two objects
$A$ and $B$ are compatible (resp. incompatible) w.r.t. $\mathcal{P}$ and $\odot$, if $A \odot B$ is healthy (resp.
sick).

Since the observable of interest is the property $\mathcal{P}$, we introduce the following notion:
two objects $A$ and $B$ are said to be $\mathcal{P}$-equivalent ($A \equiv_{\mathcal{P}} B$) if $A \in \mathcal{P} \Leftrightarrow B \in \mathcal{P}$.

Recall from algebra that a groupoid $(S, \tau)$ is a set $S$ equipped with a binary operation
$\tau$. Although this is not strictly needed for the development of our theory, for
simplicity we suppose that in every groupoid we talk about there is a neutral element
(if it is not the case, one can always be added by the standard lifting technique).

We say that a groupoid $(S, \tau)$ is a $\mathcal{P}$-semilattice if for all objects $A$, $B$ and $C$ in $S$
we have that $(A \tau B) \tau C \equiv_{\mathcal{P}} A \tau (B \tau C)$, $A \tau B \equiv_{\mathcal{P}} B \tau A$, and $A \tau A \equiv_{\mathcal{P}} A$. That is to say,
a $\mathcal{P}$-semilattice is like a semilattice, but for the fact that the equations for associativity,
commutativity and idempotence are weakened by considering $\mathcal{P}$-equivalence in place
of equivalence.

Another  crucial  definition  is  the  following: 

Definition 3.2 A groupoid $(S, \tau)$ is said to be $\mathcal{P}$-dense if $\forall T_1, T_2 \in S.\; T_1 \tau T_2 \in
\mathcal{P} \Rightarrow T_1 \in \mathcal{P} \wedge T_2 \in \mathcal{P}$. □

Roughly  speaking,  density  corresponds  to  the  very  reasonable  assumption  that 
objects  constituting  a  healthy  object  are  themselves  healthy. 

Now  we  have  all  the  ingredients  to  define  this  main  notion: 

Definition 3.3 A $\mathcal{P}$-acid groupoid (briefly, a $\mathcal{P}$-acid) is a groupoid $(S, \tau)$ that is a
$\mathcal{P}$-dense $\mathcal{P}$-semilattice. □

The name “acid” stems from the fact that a semilattice can equivalently be seen as an
aci-groupoid (viz. a groupoid that is associative, commutative and idempotent), and
so acid stands for aci and dense.

Assumption: Throughout the paper, we assume that $(\mathcal{O}, \odot)$ is a $\mathcal{P}$-acid.

We remark that for most of the results not all of the above assumptions are necessary.
We take all of them at once to simplify readability (for discussions on the minimal
required hypotheses, see e.g. [10, 12]).
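The assumptions of this section can be checked mechanically on a finite groupoid. The sketch below is illustrative only: the objects (subsets of the hypothetical features `a`, `b`, `c`), the operator `union`, and the two sample properties `no_clash` and `even_size` are assumptions made for the example, not objects taken from the paper.

```python
from itertools import combinations, product

ATOMS = "abc"  # hypothetical feature names
OBJECTS = [frozenset(c) for r in range(len(ATOMS) + 1)
           for c in combinations(ATOMS, r)]

def union(s, t):
    """Toy binary operation; the empty set is its neutral element."""
    return s | t

def no_clash(t):
    """Sample property: 'a' and 'b' do not occur together."""
    return not frozenset("ab") <= t

def even_size(t):
    """Sample property that is not dense w.r.t. union."""
    return len(t) % 2 == 0

def is_p_acid(objs, op, healthy):
    """Check the P-semilattice equations and P-density by brute force."""
    def equiv(s, t):
        # P-equivalence: both objects healthy or both sick.
        return healthy(s) == healthy(t)

    semilattice = all(
        equiv(op(op(a, b), c), op(a, op(b, c)))   # associativity up to P
        and equiv(op(a, b), op(b, a))             # commutativity up to P
        and equiv(op(a, a), a)                    # idempotence up to P
        for a, b, c in product(objs, repeat=3))
    dense = all(healthy(a) and healthy(b)
                for a, b in product(objs, repeat=2)
                if healthy(op(a, b)))
    return semilattice and dense

print(is_p_acid(OBJECTS, union, no_clash))   # dense: parts of a clash-free set are clash-free
print(is_p_acid(OBJECTS, union, even_size))  # not dense: {'a'} | {'b'} is healthy, {'a'} is not
```

The second call shows why density is a separate requirement: union is a genuine semilattice operation, yet a healthy composite can still have sick components.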

4  The  Modularity  Space 

The study of modularity for a given healthiness property is tantamount to the study of
the criteria for its modularity. We are therefore interested in the modular space (m-space), that
is, the collection of all the criteria for modularity. A way to express this information
is to consider only the most significant objects in this space. The m-space has a natural
partial ordering, namely set inclusion; the idea is thus to consider only the tops of
the m-space:

Definition 4.1 The modular basis (m-basis for short) is the collection of all the
maximal criteria. The modular dimension (m-dimension) is the cardinality of the
m-basis. □

The  modular  basis  is  a  good  representative  of  the  modular  space,  since  from  it 
we  can  build  up  the  whole  modular  space  (the  maximal  criteria  entail  all  the  other 
criteria): 

Theorem  4.2  Every  criterion  is  contained  in  a  maximal  criterion. 

4.1 k-counterexamples

The  m-dimension  gives  an  abstract  measure  of  the  complexity  of  the  modular  space. 

It is not difficult to see that the m-dimension is one iff $\mathcal{P}$ is modular, and if $\mathcal{P}$ is not
modular the m-dimension is at least two. We now give more precise results on the
m-dimension, introducing the concept of k-counterexample.

Given an ordinal $k$, a k-counterexample (to the $\odot$-modularity of $\mathcal{P}$) is a collection
$A_1, \ldots, A_k$ of pairwise incompatible healthy objects.

Usually,  a  2-counterexample  will  be  simply  called  a  counterexample. 

The next two lemmata provide the link between k-counterexamples and the
m-dimension. The first result gives a lower bound:

Lemma 4.3 If there is a k-counterexample ($k < \omega$), then the m-dimension is at least $k$.

The  second  result,  dually,  gives  an  upper  bound: 

Lemma 4.4 If there is not a k-counterexample ($k < \omega$), then the m-dimension is less
than $k$.

Combining  the  above  bounds  gives  the  following  characterization  of  the  m-dimension 
in  the  finite  case: 

Corollary 4.5 The m-dimension is $k$ ($k < \omega$) iff there is a k-counterexample but
there is no $(k+1)$-counterexample.
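On a finite model, Corollary 4.5 suggests a direct (exponential) way to compute the m-dimension: look for the largest collection of pairwise incompatible healthy objects. The sketch below assumes a toy model in which objects are sets of hypothetical features, composition is union, and healthiness is defined by a list of forbidden feature combinations; it also assumes at least one healthy object exists. These choices are illustrative and are not taken from the paper.

```python
from itertools import combinations

ATOMS = "abc"  # hypothetical feature names
OBJECTS = [frozenset(c) for r in range(len(ATOMS) + 1)
           for c in combinations(ATOMS, r)]

def make_healthy(forbidden):
    """Build a toy healthiness property from forbidden feature combinations."""
    forbidden = [frozenset(f) for f in forbidden]
    return lambda t: not any(f <= t for f in forbidden)

def is_k_counterexample(group, healthy):
    """Pairwise incompatible healthy objects (composition is set union here)."""
    return (all(healthy(t) for t in group)
            and all(not healthy(s | t) for s, t in combinations(group, 2)))

def m_dimension(objs, healthy):
    """Largest k admitting a k-counterexample, following Corollary 4.5."""
    k = 1
    while any(is_k_counterexample(g, healthy)
              for g in combinations(objs, k + 1)):
        k += 1
    return k

print(m_dimension(OBJECTS, make_healthy([])))                  # modular case
print(m_dimension(OBJECTS, make_healthy(["ab"])))              # one clash
print(m_dimension(OBJECTS, make_healthy(["ab", "bc", "ac"])))  # three clashes
```

The loop can stop at the first size with no counterexample, because any sub-collection of a k-counterexample is again a counterexample of smaller size.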

5  Vaccines 

As mentioned, the basic notion of the theory is that of vaccine. A vaccine is “a preparation
of living attenuated organisms, or living fully virulent organisms that is administered
to produce or artificially increase immunity to a particular disease” (Webster’s 7th
Collegiate Dictionary). So, suppose we want to ensure that an organism enjoys a particular
property. We can inject into it a specific vaccine for this property: if it does not get sick
due to collateral effects, we are sure it is immunized and enjoys that property.

In this paper, we utilize the notion of vaccine in a formal setting to study modularity.
Suppose we want to study the modularity behaviour of the class
of objects $\mathcal{P}$. The idea is to consider $\mathcal{P}$ as a ‘healthiness condition’, and select some
representative objects that make things go wrong (i.e. that cause modularity to fail),
using them as vaccines: we can ‘inject’ one of them, say $A$, into any other object in $\mathcal{P}$
via the composition operator $\odot$: in case there are no collateral effects, i.e. in case the
object is still healthy (belonging to $\mathcal{P}$), it will become ‘immunized’ to that particular
disease that made modularity fail.

More formally, an object $A$ is a vaccine if for the class of its vaccinated objects
($\{T : T \odot A \in \mathcal{P}\}$), $\mathcal{P}$ becomes $\odot$-modular.

The  nice  fact,  as  said  in  the  introduction,  is  that  we  will  show  that  the  criteria 
defined  by  vaccines  are  optimal  (i.e.  maximal).  This  way,  vaccines  provide  a  tool  to 
completely  describe  the  modular  space,  providing  the  best  criteria. 

We  now  start  giving  rigorous  formal  definitions. 

Definition 5.1 The class of objects vaccinated via $A$ with injection operator $\odot$ and
healthiness property $\mathcal{P}$ is

$V_A^{\odot}(\mathcal{P}) = \{T : T \odot A \in \mathcal{P}\}$ □

That is, we take every object $T$ such that injecting $A$ into it yields a healthy object
$T \odot A$.

The operator $\odot$ and the healthiness property $\mathcal{P}$ will be mostly omitted and considered
understood, hence we will also write simply $V_A$.

Now,  we  can  define  what  a  vaccine  for  modularity  is: 

Definition 5.2 $A$ is a vaccine (for the $\odot$-modularity of $\mathcal{P}$) if $V_A$ is a criterion for the
$\odot$-modularity of $\mathcal{P}$. □

That is to say,

$V_A \neq \emptyset$, and $T_1 \in V_A, \ldots, T_k \in V_A \Rightarrow T_1 \odot \cdots \odot T_k \in \mathcal{P}$.
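The definitions of vaccinated class and vaccine can be tried out on a small finite model. This is a sketch under assumptions that are not part of the paper: objects are sets of the hypothetical features `a`, `b`, `c`, composition is union, an object is healthy when it does not contain both `a` and `b`, and a criterion is taken to be a non-empty class.

```python
from itertools import combinations

ATOMS = "abc"  # hypothetical feature names
OBJECTS = [frozenset(c) for r in range(len(ATOMS) + 1)
           for c in combinations(ATOMS, r)]

def healthy(t):
    """Toy property P: 'a' and 'b' do not occur together."""
    return not frozenset("ab") <= t

def vaccinated(a):
    """The class V_A: objects that stay healthy after injecting a (union)."""
    return [t for t in OBJECTS if healthy(t | a)]

def is_criterion(q):
    """Non-empty class all of whose compositions are healthy."""
    q = list(q)
    return bool(q) and all(healthy(frozenset().union(*g))
                           for n in range(1, len(q) + 1)
                           for g in combinations(q, n))

def is_vaccine(a):
    """a is a vaccine when its vaccinated class is a criterion."""
    return is_criterion(vaccinated(a))

VACCINES = [a for a in OBJECTS if is_vaccine(a)]
print(sorted("".join(sorted(a)) for a in VACCINES))
```

In this model a sick object has an empty vaccinated class and so is not a vaccine, and a harmless object such as the empty set is not a vaccine either, since injecting it rules nothing out.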

Vaccines  can  be  composed  to  get  new  vaccines,  as  the  following  results  show: 

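Lemma 5.3 (Composition) Suppose $A$ is a vaccine for $\mathcal{P}_1$ and $B$ is a vaccine for
$\mathcal{P}_2$. If $A \odot B \in \mathcal{P}_1 \wedge \mathcal{P}_2$, then $A \odot B$ is a vaccine for $\mathcal{P}_1 \wedge \mathcal{P}_2$.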

Corollary 5.4 If $A$ and $B$ are compatible vaccines, then $A \odot B$ is a vaccine.

Vaccines are only representatives of the corresponding criteria. It is therefore important
to ask when different vaccines are representative of the same class. The following
lemma gives a neat answer to this question:

Lemma 5.5 Let $A$ and $B$ be vaccines. Then, $V_A = V_B \Leftrightarrow A$ and $B$ are compatible.

6  Vaccines  Bases 

Every vaccine for modularity defines a criterion for modularity given by the class $V_A$.
The most important reason that makes vaccines attractive to study is that this criterion
is optimal in the sense that it cannot be improved.

Theorem 6.1 (Optimality) If $A$ is a vaccine, then $V_A$ is a maximal criterion.

The m-basis is an abstract concept. However, we have just seen that vaccines
can conveniently represent the maximal criteria. So, we introduce a new manageable
representative of the m-space:

Definition 6.2 A vaccines basis (v-basis) is a collection of vaccines $\{A_i\}_{i=1 \ldots k}$ ($k$ an
ordinal) such that every maximal criterion is represented by exactly one vaccine. □

Hence, $A_1, \ldots, A_k$ is a v-basis iff $V_{A_1}, \ldots, V_{A_k}$ is the m-basis.

A v-basis does not only give a complete description of the modular space. It also
allows one to easily derive that a property is indeed a criterion by proving that it is weaker
than an optimal criterion. The precise technique is described in the full paper. This
also holds for the other kind of criteria, namely d-criteria (cf. Section 7). Hence not
only can easy proofs of the previously existing results on modularity be given, but
the investigation of new practical criteria also becomes possible.

6.1 v-Bases versus k-Counterexamples

We now analyze the tight relationships between v-bases and k-counterexamples. First
we introduce the notion of partial v-basis, which formalizes the incomplete knowledge
of a v-basis.

Definition 6.3 A partial vaccines basis is a collection $A_1, \ldots, A_k$ ($k$ an ordinal) of
vaccines giving pairwise different maximal criteria. □

Lemma 6.4 Every partial vaccines basis $\{A_1, \ldots, A_k\}$ is a k-counterexample.

As a corollary, we get that every v-basis $\{A_1, \ldots, A_k\}$ is a k-counterexample. The
next important result shows that the other direction also holds, thus providing a way
to find the v-bases:

Theorem 6.5 If the modular dimension is $k < \omega$, then every k-counterexample is a
v-basis.

Combining  these  results,  we  get  the  following  characterization  of  the  v-bases: 

Corollary 6.6 (Characterization) If the modular dimension is $k < \omega$, then the
v-bases are exactly the k-counterexamples.

Therefore, the above results suggest a way to find the optimal criteria: look for
vaccines produced by objects in k-counterexamples.

In fact, Theorem 6.5 says much more: if we know that the m-dimension is $k < \omega$
(e.g. via Corollary 4.5), then a v-basis is automatically provided by a k-counterexample.

Another immediate consequence of Theorem 6.5 is about the existence of v-bases.

Corollary 6.7 If the modular dimension is $k < \omega$, there is a v-basis.

In order to effectively find a v-basis, Theorem 6.5 requires the knowledge of the
m-dimension, which as said can be computed using Corollary 4.5. However, there is
another fundamental result that, starting from an incomplete knowledge of it (a partial
v-basis), ensures that we have found a v-basis:

Theorem 6.8 (Covering) Let $A_1, \ldots, A_k$ ($k < \omega$) be a partial v-basis. It is a v-basis
iff every healthy object belongs to at least one $V_{A_i}$: $\bigcup_{i \in [1,k]} V_{A_i} = \mathcal{P}$ (i.e. the criteria
‘cover’ the healthy objects).

The above theorem thus provides an alternative powerful methodology to find a
v-basis: build up a k-counterexample with $k$ as great as possible; prove that its elements
are vaccines (Theorem 6.5); check if the criteria cover the healthy objects (Theorem
6.8).
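The three steps of this methodology can be sketched on a finite toy model. Everything below is an illustrative assumption rather than part of the paper: objects are sets of hypothetical features, composition is union, healthiness is given by a list of forbidden feature combinations, a criterion is taken to be non-empty, and the helper name `find_v_basis` is made up for the example.

```python
from itertools import combinations

ATOMS = "abc"  # hypothetical feature names
OBJECTS = [frozenset(c) for r in range(len(ATOMS) + 1)
           for c in combinations(ATOMS, r)]

def make_healthy(forbidden):
    forbidden = [frozenset(f) for f in forbidden]
    return lambda t: not any(f <= t for f in forbidden)

def is_k_counterexample(group, healthy):
    return (all(healthy(t) for t in group)
            and all(not healthy(s | t) for s, t in combinations(group, 2)))

def vaccinated(a, healthy):
    return {t for t in OBJECTS if healthy(t | a)}

def is_criterion(q, healthy):
    q = list(q)
    return bool(q) and all(healthy(frozenset().union(*g))
                           for n in range(1, len(q) + 1)
                           for g in combinations(q, n))

def find_v_basis(healthy):
    # Step 1: build a k-counterexample with k as large as possible.
    group = ()
    for k in range(1, len(OBJECTS) + 1):
        found = next((g for g in combinations(OBJECTS, k)
                      if is_k_counterexample(g, healthy)), None)
        if found is None:
            break
        group = found
    # Step 2: check that each of its elements is a vaccine.
    classes = [vaccinated(a, healthy) for a in group]
    if not all(is_criterion(c, healthy) for c in classes):
        return None
    # Step 3: check that the criteria cover the healthy objects.
    covered = set().union(*classes)
    if covered != {t for t in OBJECTS if healthy(t)}:
        return None
    return group

print(find_v_basis(make_healthy(["ab"])))
print(find_v_basis(make_healthy(["ab", "bc", "ac"])))
```

The function returns `None` when step 2 or step 3 fails, which in this sketch only signals that the chosen counterexample does not certify a v-basis, not that none exists.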

We  will  later  (Section  9)  successfully  employ  this  methodology  in  the  applications 
of  the  theory  to  term  rewriting. 

7  Disjunctive  Criteria 

The  notion  of  criterion  for  modularity  that  we  have  given  in  Definition  3.1  is  not  the 
only  one  which  has  been  studied.  Another  kind  of  criteria,  e.g.  studied  in  [13,  20], 
requires  only  one  of  the  objects  to  be  constrained  in  order  to  ensure  their  combination 
is  healthy.  So,  we  introduce  this  concept: 

Definition 7.1 $Q$ is a disjunctive criterion (for the $\odot$-modularity of $\mathcal{P}$), or d-criterion
for short, if $\forall T_1, \ldots, T_n.\; T_1 \in Q \vee \ldots \vee T_n \in Q \Rightarrow T_1 \odot \cdots \odot T_n \in \mathcal{P}$. □

The  motivation  for  the  adjective  ‘disjunctive’  should  be  clear  from  the  definition; 
analogously,  the  usual  criterion  of  Definition  3.1  could  be  dubbed  ‘conjunctive’. 

Unlike  the  standard  criteria,  the  d-criteria  space  is  linearly  ordered,  since  only  one 
object  instead  of  all  objects  is  constrained.  The  following  definition  formalizes  the  top 
object  in  this  space: 

Definition 7.2 The kernel $\mathcal{K}$ is the greatest disjunctive criterion, that is $\mathcal{K} = \{T \in
\mathcal{P} : \forall T' \in \mathcal{P}.\; T \odot T' \in \mathcal{P} \ni T' \odot T\}$. □

It is easy to prove that, rather interestingly, the kernel has an important algebraic
meaning, since it is just the class of $\equiv_{\mathcal{P}}$-neutral elements (i.e. those elements $N$ such
that for every $T$ we have $T \odot N \equiv_{\mathcal{P}} T \equiv_{\mathcal{P}} N \odot T$).

Nicely,  from  a  v-basis  we  can  obtain  right  away  the  kernel: 

Theorem 7.3 Suppose $\{A_i\}_{i=1 \ldots k}$ is a vaccines basis. Then the kernel is $\bigcap_{i=1 \ldots k} V_{A_i}$.
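On a finite toy model the statement of Theorem 7.3 can be compared against the definition of the kernel. The sketch below assumes objects that are sets of the hypothetical features `a`, `b`, `c`, union as composition, and healthiness meaning that `a` and `b` do not occur together; the pair `{a}`, `{b}` is taken as the vaccines basis of this model. None of this comes from the paper.

```python
from itertools import combinations

ATOMS = "abc"  # hypothetical feature names
OBJECTS = [frozenset(c) for r in range(len(ATOMS) + 1)
           for c in combinations(ATOMS, r)]

def healthy(t):
    """Toy property P: 'a' and 'b' do not occur together."""
    return not frozenset("ab") <= t

HEALTHY = [t for t in OBJECTS if healthy(t)]

def vaccinated(a):
    """The class V_A for the toy model (composition is union)."""
    return {t for t in OBJECTS if healthy(t | a)}

def kernel_direct():
    """Definition 7.2: healthy objects compatible with every healthy object."""
    # Union is commutative, so one composition order is enough here.
    return {t for t in HEALTHY if all(healthy(t | u) for u in HEALTHY)}

def kernel_from_basis(basis):
    """Theorem 7.3: intersection of the vaccinated classes of a v-basis."""
    return set.intersection(*[vaccinated(a) for a in basis])

BASIS = [frozenset("a"), frozenset("b")]  # assumed v-basis of the toy model
print(kernel_direct() == kernel_from_basis(BASIS))
```

In this model both computations give the objects containing neither `a` nor `b`, that is, the objects that can be combined with any healthy object without harm.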

8  Counterexample  Structures 

In  this  section  we  turn  our  attention  to  the  other  side  of  the  coin:  when  modularity 
fails.  We  formally  study  what  happens  when  two  objects  give  a  counterexample  to 
modularity. 

Definition 8.1 A couple of classes $\{Q_1, Q_2\}$ is a counterexample structure (c-structure)
(w.r.t. $\odot$ and $\mathcal{P}$) if in every counterexample one of the two objects belongs to $Q_1$ and
the other to $Q_2$. □

The canonical ordering on structures is: $\{Q_1, Q_2\} \subseteq_{struct} \{Q_1', Q_2'\}$ iff $(Q_1 \subseteq
Q_1' \wedge Q_2 \subseteq Q_2') \vee (Q_1 \subseteq Q_2' \wedge Q_2 \subseteq Q_1')$. Then, we say that a structure $\{Q_1, Q_2\}$ is
better than another structure $\{Q_1', Q_2'\}$ if $\{Q_1, Q_2\} \subseteq_{struct} \{Q_1', Q_2'\}$: this means we
can provide with $\{Q_1, Q_2\}$ a more precise (smaller) description than with $\{Q_1', Q_2'\}$.
The best structure is thus the minimum w.r.t. $\subseteq_{struct}$.

From  a  v-basis  we  can  recover  the  best  counterexample  structure,  as  the  next  result 
shows: 

Theorem 8.2 If $\{A_1, A_2\}$ is a vaccines basis, then $\{\neg V_{A_1} \wedge \mathcal{P},\; \neg V_{A_2} \wedge \mathcal{P}\}$ is the best
counterexample structure.

Actually, more can be proved, i.e. that such a c-structure is perfect in the sense that
it provides a characterization of the counterexamples (cf. [12]).

Analogous results can be stated for v-bases of higher dimension.

9 Applications to Term Rewriting

We now provide some applications of the theory to the study of the modularity of
termination for Term Rewriting Systems.

So, we let $\mathcal{O}$ = TRSs and consider as usual the combination operator $\odot$ to be the
disjoint sum ($\oplus$) of two TRSs: when the signatures overlap the TRSs are renamed
to get disjoint signatures, and then their (disjoint) union is taken. The healthiness
property is $\mathcal{P}$ = Termination (Termination will also be indicated with the acronym SN,
after Strong Normalization). We have that

Lemma 9.1 (TRSs, $\oplus$) is SN-acid.

Among the many results on the modularity of termination (see e.g. [14, 8, 16,
18] for an overview), the best results so far obtained are the ones in [15] and [9].
We will come back to the result of [9] in the next subsection. In [15] Ohlebusch,
generalizing a previous result of Gramlich for finitely branching TRSs ([5]), proved that
‘$\mathcal{C}_{\mathcal{E}}$-termination’ is modular. It is straightforward to see that the class of $\mathcal{C}_{\mathcal{E}}$-terminating
TRSs coincides with the class of TRSs vaccinated via $\{or(X, Y) \to X,\; or(X, Y) \to Y\}$.
This, a posteriori, implies that the above TRS is a vaccine (for the modularity of
termination).

Hence,  using  Theorem  6.1  we  obtain  right  away: 

Theorem 9.2 $\mathcal{C}_{\mathcal{E}}$-termination is a maximal criterion.

That  is  to  say,  the  result  of  [15]  cannot  be  improved. 

But what is the complexity of the modular space for termination? The following
result gives a formal confirmation that the topic is quite intricate:

Theorem  9.3  The  m-dimension  is  at  least  three. 

The  proof  of  the  above  result  makes  use  of  Lemma  4.3. 

Whether the m-dimension is indeed three is still one of the most important open
problems (we conjecture it is).

9.0.1  The  Left-Linear  Case 

As  just  seen,  the  situation  for  termination  is  quite  complicated,  since  we  have  proved 
that  the  m-dimension  is  at  least  three,  and  only  one  vaccine  has  been  found  so  far.  In 
the  left-linear  case  we  will  be  able  to  completely  solve  the  problem,  finding  a  v-basis. 

There are two best results on the modularity of termination for left-linear TRSs.
The first stems from the one seen above: in the left-linear case, $\{or(X, Y) \to X,\;
or(X, Y) \to Y\}$ is a vaccine.

So, by Theorem 6.1 we can infer that $\mathcal{C}_{\mathcal{E}}$-termination is a maximal criterion even
for left-linear TRSs.

The second is the result proved in [9]. Recall that a TRS is said to be consistent (with
respect to reduction), briefly CON$^{\to}$, if no term reduces to two different variables. In
the aforementioned paper it has been shown that termination is modular for left-linear
and consistent TRSs.

We have seen in Section 4 that there are deep relationships between k-counterexamples
and v-bases. The most famous counterexample to the modularity of termination
has been given by Toyama in [19]: $\{F(0, 1, X) \to F(X, X, X)\}$ and $\{or(X, Y) \to
X,\; or(X, Y) \to Y\}$. As seen above, $\{or(X, Y) \to X,\; or(X, Y) \to Y\}$ is a vaccine.
Hence, a stimulating hypothesis is that $\{F(0, 1, X) \to F(X, X, X)\}$ is a vaccine as
well. Amazingly, this turns out to be true:
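To see why the two systems above form a counterexample, one can trace the well-known cyclic reduction of the term F(0, 1, or(0, 1)) in their disjoint sum. The following sketch hard-codes the three rules rather than implementing general pattern matching; the tuple encoding of terms and the function names are assumptions made for the illustration.

```python
# Terms: constants are strings, applications are tuples (symbol, arg1, ...).

def root_steps(t):
    """One-step reducts obtained by applying a rule at the root of t."""
    out = []
    if isinstance(t, tuple):
        # F(0, 1, X) -> F(X, X, X)
        if t[0] == "F" and t[1] == "0" and t[2] == "1":
            x = t[3]
            out.append(("F", x, x, x))
        # or(X, Y) -> X  and  or(X, Y) -> Y
        if t[0] == "or":
            out.extend([t[1], t[2]])
    return out

def successors(t):
    """All one-step reducts of t, at the root or inside an argument."""
    out = root_steps(t)
    if isinstance(t, tuple):
        for i in range(1, len(t)):
            for s in successors(t[i]):
                out.append(t[:i] + (s,) + t[i + 1:])
    return out

def reachable_in(t, n):
    """Set of terms reachable from t in exactly n rewrite steps."""
    frontier = {t}
    for _ in range(n):
        frontier = {s for u in frontier for s in successors(u)}
    return frontier

OR01 = ("or", "0", "1")
START = ("F", "0", "1", OR01)

# F(0,1,or(0,1)) -> F(or(0,1),or(0,1),or(0,1)) -> F(0,or(0,1),or(0,1)) -> F(0,1,or(0,1))
print(START in reachable_in(START, 3))
```

The cycle needs both systems: the F rule duplicates the or-term, and the collapsing or rules then rebuild the arguments 0 and 1 that make the F rule applicable again.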


Theorem 9.4 For left-linear TRSs, $V_{\{F(0,1,X) \to F(X,X,X)\}} = \mathrm{SN} \wedge \mathrm{CON}^{\to}$.

That is to say, the class of left-linear TRSs vaccinated by $\{F(0, 1, X) \to F(X, X, X)\}$
is just the criterion found in [9].

Corollary 9.5 In the left-linear case, $\{F(0, 1, X) \to F(X, X, X)\}$ is a vaccine.

Hence,  we  get 

Corollary 9.6 In the left-linear case, $\mathrm{SN} \wedge \mathrm{CON}^{\to}$ is a maximal criterion.

Thus,  the  result  of  [9]  cannot  be  improved. 

The  remarkable  thing  is  that  with  these  two  vaccines  we  have  completed  the  analysis 
of  the  modular  space,  since  they  form  a  v-basis: 

Theorem 9.7 The m-dimension for left-linear TRSs is two, and a vaccines basis is
given by $\{F(0, 1, X) \to F(X, X, X)\}$, $\{or(X, Y) \to X,\; or(X, Y) \to Y\}$.

That  is  to  say,  the  above  two  optimal  criteria  completely  solve  the  problem  of 
modularity  of  termination  for  left-linear  TRSs:  there  are  no  other  optimal  criteria  and 
all  the  other  criteria  are  subsumed  by  one  of  the  two. 

Also, the m-dimension being 2, by Corollary 6.6 we have a characterization of the
v-bases: they are just the counterexamples.

As  far  as  d-criteria  are  concerned,  Middeldorp  in  [13]  showed  that  whenever  one  of 
two  terminating  TRSs  is  both  non-collapsing  and  non-duplicating,  then  their  disjoint 
sum  is  terminating;  that  is  to  say,  he  proved  that  “terminating  and  non-collapsing  and 
non-duplicating”  is  a  disjunctive  criterion.  Toyama,  Klop  and  Barendregt  showed  in 
[20]  that  whenever  one  of  two  terminating  TRSs  is  confluent  and  non-collapsing,  then 
their  disjoint  sum  is  terminating  (hence,  they  proved  that  “terminating  and  confluent 
and  non-collapsing”  is  a  d-criterion). 

Using  the  result  on  d-criteria  (Theorem  7.3),  we  can  properly  generalize  both  of 
these  results  in  the  left-linear  case,  giving  the  best  d-criterion  (the  kernel): 

Theorem 9.8 For left-linear TRSs, $\mathrm{CON}^{\to} \wedge \mathcal{C}_{\mathcal{E}}$-termination is the greatest disjunctive
criterion for the modularity of termination.

We now consider c-structures. Ohlebusch in [15] (again, extending a result of Gramlich
in [5] for finitely branching TRSs), showed that in every counterexample one of the
TRSs is not $\mathcal{C}_{\mathcal{E}}$-terminating and the other is collapsing (hence, in our terminology, he
showed that $\{\neg\mathcal{C}_{\mathcal{E}}$-termination, $\neg$non-collapsibility$\}$ is a c-structure). Schmidt-Schauß,
Marchiori and Panitz showed in [18] that, in the left-linear case, in every counterexample
one of the TRSs is CON$^{\to}$ and the other is $\neg$CON$^{\to}$ (that is, $\{\mathrm{CON}^{\to}, \neg\mathrm{CON}^{\to}\}$
is a c-structure). Both of these results require non-trivial proofs. Via Theorem 8.2, we
can easily not only generalize all of these results in the left-linear case, but also provide
the best c-structure:

Theorem 9.9 $\{\neg\mathrm{CON}^{\to} \wedge \mathrm{SN},\; \neg\mathcal{C}_{\mathcal{E}}$-termination $\wedge\, \mathrm{SN}\}$ is the best counterexample
structure.

The above theorem gives the following result: in every counterexample to the modularity
of termination, one of the TRSs is non-consistent and the other is non-$\mathcal{C}_{\mathcal{E}}$-terminating.

Other applications, as mentioned in Section 6, include the possibility to give easy
proofs of previously existing results on modularity (for example the results in [17] and
[13] can be provided, in the left-linear case, with an easy proof).

Finally, the optimality of the v-basis allows one to infer right away results on the relative
strength of other criteria.

For instance, it has been directly proved with some effort in [5] that Simple Termination
implies $\mathcal{C}_{\mathcal{E}}$-termination, and that termination plus non-duplication imply
$\mathcal{C}_{\mathcal{E}}$-termination. These results immediately follow from Theorem 9.2, once noticed that
Simple Termination ([8]) and termination plus non-duplication ([17]) are criteria, and
that $\{or(X, Y) \to X,\; or(X, Y) \to Y\}$ is both simply terminating and non-duplicating.

10  Remarks 

In  this  extended  abstract  we  have  sketched  the  core  of  the  theory  of  vaccines,  and 
presented  as  a  particular  instance  some  successful  applications  to  modularity  in  term 
rewriting.  However,  so  far  the  theory  of  vaccines  has  been  employed  to  obtain  a 
variety  of  other  results.  For  instance,  we  have  applied  it  to  study  the  modularity 
problem for completeness and uniqueness of normal forms w.r.t. reduction (UN$^{\to}$),
finding  vaccines  for  their  modularity,  and  this  way  improving  many  existing  results  so 
far  obtained  in  the  literature.  Also,  besides  many  other  results  which  are  variations  and 
generalizations  of  the  main  results  here  presented,  we  have  investigated  the  major  topic 
of  multimodularity,  where  other  combinations  of  more  than  two  objects  are  studied 
(see  [10,  12]).  Again,  via  a  v-basis  we  can  obtain  precise  information  on  what  kind  of 
multimodular  behaviour  a  certain  property  satisfies. 

Currently,  we  are  investigating  practical  applications  of  the  theory  to  the  study  of 
modularity  for  other  paradigms,  like  functional  or  logic  programming  (cf.  [2]).  Note 
that  even  in  the  rewriting  field  there  are  still  many  other  modularity  topics  to  which 
the  theory  of  vaccines  can  be  applied,  including  e.g.  more  involved  combinations  of 
TRSs  (like  composable  ones,  cf.  [16]  for  a  survey),  higher  order  rewriting  in  its  various 
forms (see e.g. [7, 6]), conditional rewriting ([7, 14]), combinations with $\lambda$-calculus and
systems in the $\lambda$-cube (cf. e.g. [1]), and so on. For instance, the theory of vaccines
can  be  applied  to  the  criterion  developed  in  [4]  for  conditional  rewriting,  showing  that 
it  is  optimal  for  finitely  branching  CTRSs.  Also,  we  have  shown  that  the  theory  of 
vaccines  nicely  interacts  with  unraveling  theory  (cf.  [11]),  and  shown  how  one  can  thus 
automatically  translate  a  lot  of  modularity  results  from  term  rewriting  to  conditional 
rewriting: for instance, we have lifted the result of Theorem 9.7, showing that, for
left-linear normal CTRSs, the same two TRSs provide a v-basis for decreasingness.

Abstract. The equivalence problem for deterministic pushdown automata
is shown to be decidable. We exhibit a complete formal system for
deducing equivalent pairs of deterministic rational series on the alphabet
associated with a dpda $\mathcal{M}$.

1 Introduction

The so-called “equivalence problem for deterministic pushdown automata” (dpda for short), is the
following decision problem:

INSTANCE: two dpda A, B. QUESTION: L(A) = L(B)?

where L(A) (resp. L(B)) is the language recognized by A (resp. B). (This problem is often denoted by
Eq(D, D), where D stands for the class of all dpda). The question of whether this problem is decidable or
not is raised in [GG66] and has received much attention since then. Besides the fact that this question
was natural from the point of view of formal language theory, it appeared later as Turing-equivalent with
other equivalence problems for different types of recursive program schemes (see [Cou90] for a survey).
Some other Turing-equivalent problems on semi-Thue systems were also found (see [Sen94] for a survey)
and formulations in terms of bisimulation equivalence of infinite graphs (or processes) have been found
too (see [Cau95] for a survey).

Among a large number of papers let us only quote [Val74, VP75, Bee76, Rom85, Oya87, Sti96] which
proved decidability of Eq(D', D') for subclasses D' of the full class D of dpda. (We refer the reader
to the surveys ([Cou90, Cau95, Lis96]) for other results on problems related to Eq(D, D)). The work
[Mei89, Mei92] is an attempt to solve the general problem. On account of its incompleteness (see for
example the comment in [Lis96, p.219]) it does not provide a full solution; nevertheless it introduced a
fundamental new idea: the notion of linear independence for languages.

We  prove  here  that  the  equivalence  problem  for  dpda  is  decidable  (theorem  9.3). 

We obtain this result by providing a complete formal system 𝒟_0 for equivalence identities between deterministic rational series (we use here a type of formal system inspired by [Cou83] and a notion of deterministic series inspired by [HHY79]). The proof of this completeness property leans on three types of arguments:

- in section 3 we develop around the fundamental idea of [Mei89, Mei92] an algebraic theory of “d-spaces”,

- in sections 5-7 these structure results are turned into a construction of strategies for the formal system 𝒟_0,

- in section 8 we analyze the infinite trees generated by some strategies associated with 𝒟_0.
2 Preliminaries

2.1  Pushdown  automata 

A pushdown automaton on the alphabet X is a 6-tuple M = <X, Z, Q, δ, q_0, z_0> where Z is the finite stack-alphabet, Q is the finite set of states, q_0 ∈ Q is the initial state, z_0 is the initial stack-symbol and δ : QZ × (X ∪ {ε}) → 𝒫_f(QZ*) is the transition mapping.

Let q, q' ∈ Q, ω, ω' ∈ Z*, z ∈ Z, f ∈ X* and a ∈ X ∪ {ε}; we note (qzω, af) ⊢_M (q'ω'ω, f) if q'ω' ∈ δ(qz, a). ⊢*_M is the reflexive and transitive closure of ⊢_M. For every qω, q'ω' ∈ QZ* and f ∈ X*, we note qω →^f_M q'ω' iff (qω, f) ⊢*_M (q'ω', ε). M is said deterministic iff, for every z ∈ Z, q ∈ Q:

either Card(δ(qz, ε)) = 1 and for every x ∈ X, Card(δ(qz, x)) = 0, (1)

or Card(δ(qz, ε)) = 0 and for every x ∈ X, Card(δ(qz, x)) ≤ 1. (2)

M is said real-time iff, for every qz ∈ QZ, Card(δ(qz, ε)) = 0. A dpda M is said normalized iff, for every qz ∈ QZ, x ∈ X:

q'ω' ∈ δ(qz, x) ⇒ |ω'| ≤ 2, and q'ω' ∈ δ(qz, ε) ⇒ |ω'| = 0. (3)

Given some finite set F ⊆ QZ* of configurations, the language recognized by M with final configurations F is defined by L(M, F) = {w ∈ X* | ∃c ∈ F, q_0 z_0 →^w_M c}.
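
Conditions (1), (2) and (3) can be checked mechanically on a finite transition table. The following is a minimal Python sketch under stated assumptions: the dictionary encoding of δ, the names `is_deterministic`, `is_real_time`, `is_normalized` and the sample automaton `DELTA` (a dpda for {a^n b^n | n ≥ 1}) are illustrative choices, not taken from the paper.

```python
EPS = ""  # stands for the empty word epsilon

# Assumed encoding: delta maps (state, stack_symbol, letter_or_EPS) to a set of
# (next_state, pushed_word) pairs, where pushed_word is a tuple of stack symbols
# replacing the popped top symbol.
STATES = {"q0", "q1"}
STACK = {"Z0", "A"}
ALPHABET = {"a", "b"}
DELTA = {
    ("q0", "Z0", "a"): {("q0", ("A", "Z0"))},
    ("q0", "A", "a"): {("q0", ("A", "A"))},
    ("q0", "A", "b"): {("q1", ())},
    ("q1", "A", "b"): {("q1", ())},
    ("q1", "Z0", EPS): {("q1", ())},
}


def card(delta, q, z, a):
    """Card(delta(qz, a)), with missing entries read as the empty set."""
    return len(delta.get((q, z, a), set()))


def is_deterministic(delta, states, stack, alphabet):
    """Every mode qz satisfies condition (1) or condition (2)."""
    for q in states:
        for z in stack:
            eps_moves = card(delta, q, z, EPS)
            cond1 = eps_moves == 1 and all(card(delta, q, z, x) == 0 for x in alphabet)
            cond2 = eps_moves == 0 and all(card(delta, q, z, x) <= 1 for x in alphabet)
            if not (cond1 or cond2):
                return False
    return True


def is_real_time(delta, states, stack):
    """No mode has an epsilon-transition."""
    return all(card(delta, q, z, EPS) == 0 for q in states for z in stack)


def is_normalized(delta):
    """Condition (3): letter moves push at most 2 symbols, epsilon moves push none."""
    for (_, _, a), targets in delta.items():
        bound = 0 if a == EPS else 2
        if any(len(pushed) > bound for (_, pushed) in targets):
            return False
    return True
```

On the sample table, the mode q1 Z0 is ε-bound and all other modes are ε-free, so the automaton is deterministic and normalized but not real-time.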

2.2  Deterministic  context-free  grammars 

Let M be some deterministic pushdown automaton (for sake of simplicity we suppose here that M is normalized). The variable alphabet V_M associated to M is defined as: V_M = {[p, z, q] | p, q ∈ Q, z ∈ Z}.

The context-free grammar G_M associated to M is then G_M = <X, V, P> where V = V_M and P is the set of all the pairs of one of the following forms:

([p, z, q], x[p', z_1, p''][p'', z_2, q]) or ([p, z, q], x'[p', z', q]) or ([p, z, q], a) (4)

where p, q ∈ Q, z ∈ Z, x, x' ∈ X, a ∈ X ∪ {ε}, p'z_1z_2 ∈ δ(pz, x), p'z' ∈ δ(pz, x'), q ∈ δ(pz, a). G_M is a strict-deterministic grammar. (A general theory of this class of grammars is presented in [Har78] and used in [HHY79].) We call mode every element of QZ ∪ {ε}. For every q ∈ Q, z ∈ Z, qz is said ε-bound (respectively ε-free) iff condition (1) (resp. condition (2)) in the above definition of deterministic automata is realized. The mode ε is said ε-free. We define a mapping μ : V* → QZ ∪ {ε} by

μ(ε) = ε and μ([p, z, q] · β) = pz,

for every p, q ∈ Q, z ∈ Z, β ∈ V*. For every w ∈ V* we call μ(w) the mode of the word w.
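
The three production forms in (4) and the mode map can be generated directly from a normalized transition table. The sketch below is only an illustration under assumptions: the encoding of δ as a Python dictionary, the representation of a variable [p, z, q] as a 3-tuple, and the names `productions` and `mode` are hypothetical and not part of the paper.

```python
EPS = ""  # stands for the empty word epsilon

STATES = ["q0", "q1"]
# Assumed encoding: (state, stack_symbol, letter_or_EPS) -> set of
# (next_state, pushed_word); this sample is a normalized dpda for a^n b^n.
DELTA = {
    ("q0", "Z0", "a"): {("q0", ("A", "Z0"))},
    ("q0", "A", "a"): {("q0", ("A", "A"))},
    ("q0", "A", "b"): {("q1", ())},
    ("q1", "A", "b"): {("q1", ())},
    ("q1", "Z0", EPS): {("q1", ())},
}


def productions(delta, states):
    """Build the pairs (variable, right-hand side) of the three forms in (4).

    A variable [p, z, q] is the tuple (p, z, q); a right-hand side is a tuple
    whose first entry (if any) is a terminal letter, followed by variables.
    """
    prods = set()
    for (p, z, a), targets in delta.items():
        head = () if a == EPS else (a,)
        for (p1, pushed) in targets:
            if len(pushed) == 0:
                # third form: q in delta(pz, a) gives ([p, z, q], a)
                prods.add(((p, z, p1), head))
            elif len(pushed) == 1:
                # second form: one variable, for every target state q
                for q in states:
                    prods.add(((p, z, q), head + ((p1, pushed[0], q),)))
            else:
                # first form: two variables, for every intermediate p2 and every q
                for p2 in states:
                    for q in states:
                        rhs = head + ((p1, pushed[0], p2), (p2, pushed[1], q))
                        prods.add(((p, z, q), rhs))
    return prods


def mode(word):
    """Mode of a word over V: () for the empty word, else (p, z) of its first letter."""
    return () if not word else word[0][:2]
```

For the sample automaton this yields eleven productions: four for each of the two pushing transitions and one for each of the three popping transitions.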

For technical reasons (which will be made clear in section 7), we suppose that Z contains a special symbol e such that, for every q ∈ Q, δ(qe, ε) = {q} and im(δ) ⊆ 𝒫_f(Q(Z − {e})*).

2.3  Free  monoids  acting  on  semi-rings 

Semi-ring B⟨⟨W⟩⟩. Let (B, +, ·, 0, 1) where B = {0, 1} denote the semi-ring of “booleans”. Let W be some alphabet. By (B⟨⟨W⟩⟩, +, ·, ∅, ε) we denote the semi-ring of boolean series over W: every boolean series S ∈ B⟨⟨W⟩⟩ can be written in a unique way as: S = Σ_{w ∈ W*} S_w · w, where, for every w ∈ W*, S_w ∈ B. The support of S is the language

supp(S) = {w ∈ W* | S_w ≠ 0}.

In the particular case where the semi-ring of coefficients is B (which is the only case considered in this article) we sometimes identify the series S with its support. We recall that for every S ∈ B⟨⟨W⟩⟩, S* is the series defined by: S* = Σ_{0 ≤ n} S^n. Given two alphabets W, W', a map ψ : B⟨⟨W⟩⟩ → B⟨⟨W'⟩⟩ is said σ-additive iff it fulfills: for every denumerable family (S_i)_{i ∈ ℕ} of elements of B⟨⟨W⟩⟩, ψ(Σ_{i ∈ ℕ} S_i) = Σ_{i ∈ ℕ} ψ(S_i). A map ψ : B⟨⟨W⟩⟩ → B⟨⟨W'⟩⟩ which is both a semi-ring homomorphism and a σ-additive map is usually called a substitution.

’  but  without  loss  of  generality  for  the  equivalence  problem 
Actions of monoids. Given a semi-ring (S, +, ·, 0, 1) and a monoid (M, ·, 1_M), a map ∘ : S × M → S is called a right-action of the monoid M over the semi-ring S iff, for every S, T ∈ S, m, m' ∈ M:

0 ∘ m = 0, S ∘ 1_M = S, (S + T) ∘ m = (S ∘ m) + (T ∘ m) and S ∘ (m · m') = (S ∘ m) ∘ m'. (5)

In the particular case where S = B⟨⟨W⟩⟩, ∘ is said to be a σ-right-action if it fulfills the additional property that, for every denumerable family (S_i)_{i ∈ ℕ} of elements of S and m ∈ M:

(Σ_{i ∈ ℕ} S_i) ∘ m = Σ_{i ∈ ℕ} (S_i ∘ m). (6)

The action of W* on B⟨⟨W⟩⟩. We recall the following classical σ-right-action • of the monoid W* over the semi-ring B⟨⟨W⟩⟩: for all S, S' ∈ B⟨⟨W⟩⟩, u ∈ W*,

S • u = S' ⟺ ∀w ∈ W*, (S'_w = 1 iff S_{u·w} = 1)

(i.e. S • u is the left-quotient of S by u, or the residual of S by u). For every S ∈ B⟨⟨W⟩⟩ we denote by Q(S) the set of residuals of S: Q(S) = {S • u | u ∈ W*}. We recall that S is said rational iff the set Q(S) is finite. We define the norm of a series S ∈ B⟨⟨W⟩⟩, denoted ||S||, by: ||S|| = Card(Q(S)) ∈ ℕ ∪ {∞}.
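
For a series with finite support, the residuals and the norm can be computed by direct enumeration. The following Python sketch assumes a series is identified with its support, given as a finite set of strings whose characters are the letters of W; the function names `residual`, `residuals` and `norm` are illustrative only.

```python
def residual(S, u):
    """S . u : the set of words w such that u.w belongs to S (left-quotient by u)."""
    return frozenset(w[len(u):] for w in S if w.startswith(u))


def residuals(S):
    """Q(S): all residuals of a finite-support series, found by closure under letters."""
    S = frozenset(S)
    letters = {c for w in S for c in w}
    seen = {S}
    todo = [S]
    while todo:
        T = todo.pop()
        for x in letters:
            R = residual(T, x)
            if R not in seen:
                seen.add(R)
                todo.append(R)
    return seen


def norm(S):
    """||S|| = Card(Q(S)); always finite here because the support is finite."""
    return len(residuals(S))
```

For S = {ab, ac} the residuals are S itself, {b, c}, {ε} and ∅, so ||S|| = 4. Rational series with infinite support would need a finite automaton representation instead of an explicit set.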

The action of X* on B⟨⟨V⟩⟩. Let us fix now a deterministic (normalized) pda M and consider the associated grammar G. We define a σ-right-action ⊗ of the monoid (X ∪ {e})* over the semi-ring B⟨⟨V⟩⟩ by: for every p, q ∈ Q, A ∈ Z, h ∈ V*, β ∈ V*, x ∈ X,

[p, A, q] · β ⊗ x = h · β iff ([p, A, q], x · h) ∈ P, [p, A, q] · β ⊗ e = h · β iff ([p, A, q], h) ∈ P, (7)

ε ⊗ x = ∅, ε ⊗ e = ∅. (8)

A series S ∈ B⟨⟨V⟩⟩ is said ε-free iff ∀w ∈ V*, S_w = 1 ⇒ μ(w) is ε-free. We denote by B_ε⟨⟨V⟩⟩ the subset of ε-free series. We define the map ρ_ε : B⟨⟨V⟩⟩ → B⟨⟨V⟩⟩ as the unique σ-additive map such that, for every p ∈ Q, z ∈ Z, q ∈ Q, β ∈ V*,

ρ_ε([p, z, q] · β) = ρ_ε(([p, z, q] ⊗ e) · β) if pz is ε-bound, ρ_ε([p, z, q] · β) = [p, z, q] · β if pz is ε-free,

and ρ_ε(ε) = ε. The above definition is sound because, by hypothesis (3), every [p, z, q] ⊗ e is either the unit series ε or the empty series ∅. One can notice that for every w ∈ V*, ρ_ε(w) ∈ V* ∪ {∅}. We call ρ_ε the ε-reduction map. We then define ⊙ as the unique right-action of the monoid X* over the semi-ring B⟨⟨V⟩⟩ such that: for every S ∈ B⟨⟨V⟩⟩, x ∈ X, S ⊙ x = ρ_ε(ρ_ε(S) ⊗ x). One can notice that if u ≠ ε, then S ⊙ u is ε-free. Let us consider the unique substitution φ : B⟨⟨V⟩⟩ → B⟨⟨X⟩⟩ fulfilling: for every p, q ∈ Q, z ∈ Z, φ([p, z, q]) = {u ∈ X* | [p, z, q] ⊙ u = ε} (in other words, φ maps every subset L ⊆ V* on the language generated by the grammar G from the set of axioms L).

Lemma 2.1 φ is a morphism of right-actions, i.e. for every S ∈ B⟨⟨V⟩⟩, u ∈ X*, φ(S ⊙ u) = φ(S) • u.
We denote by ≡ the kernel of φ, i.e.: for every S, T ∈ B⟨⟨V⟩⟩, S ≡ T ⟺ φ(S) = φ(T).


3  Series  and  languages 

3.1  Deterministic  series  and  matrices 

We introduce here a notion of deterministic series which, in the case of the alphabet V associated to a dpda M, generalizes the classical notion of configuration of M. The main advantage of this notion is that, unlike for configurations, we shall be able to define nice algebraic operations on these series (this is done in section 3.2). Let us consider a pair (W, ~) where W is an alphabet and ~ is an equivalence relation over W. We call (W, ~) a structured alphabet. The two examples we have in mind are:

- the case where W = V, the variable alphabet associated to M, and [p, A, q] ~ [p', A', q'] iff p = p' and A = A' (see [Har78]),

- the case where W = X, the terminal alphabet of M, and x ~ y holds for every x, y ∈ X (see [Har78]).
Definition 3.1 Let S ∈ B⟨⟨W⟩⟩. S is said left-deterministic iff either (1) S = ∅ or (2) S = ε or (3) ∀w, w' ∈ W*, S_w = S_{w'} = 1 ⇒ ∃A, A' ∈ W, w_1, w'_1 ∈ W*, A ~ A', w = A · w_1 and w' = A' · w'_1.

Definition 3.2 Let S ∈ B⟨⟨W⟩⟩. S is said deterministic iff, for every u ∈ W*, S • u is left-deterministic.

This notion is the straightforward extension to the infinite case of the notion of (finite) set of associates defined in [HHY79].
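
Definitions 3.1 and 3.2 can be tested directly on series with finite support. The sketch below is a hedged illustration: a series is assumed to be a finite set of words, each word a tuple of letters, and the equivalence ~ is assumed to be given by a hypothetical function `cls` mapping each letter to its class. For a finite support only the residuals by prefixes of words of the support need checking, since every other residual is ∅.

```python
def is_left_deterministic(S, cls):
    """Definition 3.1 for a finite support S (set of tuples of letters)."""
    S = set(S)
    if not S or S == {()}:
        return True          # cases (1) S = empty series and (2) S = epsilon
    if () in S:
        return False         # the empty word has no first letter, so (3) fails
    # case (3): all first letters lie in one equivalence class
    return len({cls(w[0]) for w in S}) == 1


def is_deterministic(S, cls):
    """Definition 3.2: every residual S . u is left-deterministic."""
    S = set(S)
    prefixes = {w[:i] for w in S for i in range(len(w) + 1)}
    for u in prefixes:
        res = {w[len(u):] for w in S if w[:len(u)] == u}
        if not is_left_deterministic(res, cls):
            return False
    return True


def variable_class(letter):
    """The structure on V: [p, A, q] ~ [p', A', q'] iff p = p' and A = A'."""
    return letter[:2]
```

As a usage example, the set {[pAq1], [pAq2][q2Bq1]} is deterministic for `variable_class`, whereas {[pAq], [rAq]} with p ≠ r is not.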

We denote by DB⟨⟨W⟩⟩ the subset of deterministic boolean series over W. Let us denote by B_{n,m}⟨⟨W⟩⟩ the set of (n, m)-matrices with entries in the semi-ring B⟨⟨W⟩⟩.

Definition 3.3 Let m ∈ ℕ, S ∈ B_{1,m}⟨⟨W⟩⟩: S = (S_1, ..., S_m). S is said left-deterministic iff either
(1) ∀i ∈ [1, m], S_i = ∅ or (2) ∃i_0 ∈ [1, m], S_{i_0} = ε and ∀i ≠ i_0, S_i = ∅ or (3) ∀w, w' ∈ W*, ∀i, j ∈ [1, m], (S_i)_w = (S_j)_{w'} = 1 ⇒ ∃A, A' ∈ W, w_1, w'_1 ∈ W*, A ~ A', w = A · w_1 and w' = A' · w'_1.

The right-action • on B⟨⟨W⟩⟩ is extended componentwise to B_{n,m}⟨⟨W⟩⟩: for every S = (s_{i,j}), u ∈ W*, the matrix T = S • u is defined by t_{i,j} = s_{i,j} • u.

Definition 3.4 Let S ∈ B_{1,m}⟨⟨W⟩⟩. S is said deterministic iff, for every u ∈ W*, S • u is left-deterministic.

We denote by DB_{1,m}⟨⟨W⟩⟩ the subset of deterministic row-vectors of dimension m over B⟨⟨W⟩⟩.

Definition 3.5 Let S ∈ B_{n,m}⟨⟨W⟩⟩. S is said deterministic iff, for every i ∈ [1, n], S_{i,*} is a deterministic row-vector.

The following property is crucial for establishing a correct theory of deterministic spaces (see §3.2 below).

Lemma 3.6 For every S ∈ DB_{n,m}⟨⟨W⟩⟩, T ∈ DB_{m,s}⟨⟨W⟩⟩, S · T ∈ DB_{n,s}⟨⟨W⟩⟩.

W = V. Let (W, ~) be the structured alphabet (V, ~) associated with M and let us consider a bijective numbering of the elements of Q: (q_1, q_2, ..., q_{n_Q}). Some particular “vectorial” notions turn out to be useful:

- we define a Q-series to be a family (S_q)_{q ∈ Q} such that the row-vector (S_{q_1}, S_{q_2}, ..., S_{q_{n_Q}}) is deterministic,

- we define a Q-form to be a family Φ = (Φ_q)_{q ∈ Q} of deterministic series.

Given a Q-series S and a Q-form Φ, their Q-product S ∗ Φ is the deterministic series defined by S ∗ Φ = Σ_{q ∈ Q} S_q · Φ_q. If the Q-series (S_q)_{q ∈ Q} is identified with the row-vector (S_{q_1}, S_{q_2}, ..., S_{q_{n_Q}}) and the Q-form (Φ_q)_{q ∈ Q} with the column-vector (Φ_{q_j})_{j ∈ [1, n_Q]}, then the Q-product appears to be just the ordinary product of matrices.

Let us define here handy notations for some particular row-vectors or Q-series. Let us use the Kronecker symbol δ_{i,j} meaning ε if i = j and ∅ if i ≠ j. For every 1 ≤ n, 1 ≤ i ≤ n, we define the row-vector ε^n_i as: ε^n_i = (ε^n_{i,j})_{1 ≤ j ≤ n} where ∀j, ε^n_{i,j} = δ_{i,j}. We call unit row-vector any vector of the form ε^n_i.

For every ω ∈ Z*, p, q ∈ Q, [pωq] is the deterministic series defined inductively by:

[pεq] = ∅ if p ≠ q, [pεq] = ε if p = q,

[pωq] = Σ_{r ∈ Q} [pAr] · [rω'q] if ω = A · ω' for some A ∈ Z, ω' ∈ Z*.

By [pω] we denote the Q-series: [pω] = ([pωq])_{q ∈ Q}. (In particular [q_i] = ε^{n_Q}_i.) By [ω] we denote the Q-matrix: [ω] = ([pωq])_{p ∈ Q, q ∈ Q}. The next lemma relates the right-action ⊙ with the right-action •.
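
The inductive definition of [pωq] unfolds into a finite set of words over V: one word for each choice of intermediate states. A minimal Python sketch, assuming letters of V are 3-tuples (p, A, r) and a series is identified with its support; the function name `bracket` is a hypothetical choice.

```python
def bracket(p, omega, q, states):
    """Support of the series [p omega q], as a set of tuples of letters of V.

    omega is a tuple of stack symbols; a letter [p, A, r] is the tuple (p, A, r).
    """
    if not omega:
        # [p eps q] is epsilon if p == q and the empty series otherwise
        return {()} if p == q else set()
    A, rest = omega[0], omega[1:]
    result = set()
    for r in states:
        # [p omega q] = sum over r of [p A r] . [r omega' q]
        for tail in bracket(r, rest, q, states):
            result.add(((p, A, r),) + tail)
    return result
```

With two states p, q and ω = AB, the series [pABq] has the two words [pAp][pBq] and [pAq][qBq]. Every word of [pωq] starts with a letter of the class (p, A), which matches the claim that this series is deterministic.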

Lemma 3.7 Let S ∈ DB⟨⟨V⟩⟩, u ∈ X*. One of the three following cases must occur: (1) S ⊙ u = ∅, or (2) S ⊙ u = ε, or (3) ∃u_1, u_2 ∈ X*, v_1 ∈ V*, q ∈ Q, A ∈ Z, Φ Q-form such that u = u_1 · u_2, S ⊙ u_1 = S • v_1 = [qA] ∗ Φ and S ⊙ u = ([qA] ⊙ u_2) ∗ Φ.

Corollary 3.8 Let S ∈ DB⟨⟨V⟩⟩, u ∈ X*. Then S ⊙ u ∈ DB⟨⟨V⟩⟩.

The particular letters [p, e, q] for p, q ∈ Q play a special role in sections 7 and 8: we use them as marks in the series (somehow like the ceilings of [Val74]). We define below a map ρ_e which removes the marks in the series. Let us define ρ_e : DB⟨⟨V⟩⟩ → B⟨⟨V⟩⟩ as the unique substitution such that:

ρ_e([p, e, q]) = ε if p = q, ρ_e([p, e, q]) = ∅ if p ≠ q.

Lemma 3.9 For every S ∈ DB⟨⟨V⟩⟩, ρ_e(S) ∈ DB⟨⟨V⟩⟩ and ||ρ_e(S)|| ≤ ||S||.
Rational series, norm. Let us generalize the definition of rationality of series in B⟨⟨W⟩⟩ to matrices. Given M ∈ B_{n,m}⟨⟨W⟩⟩ we denote by Q(M) the set of residuals of M: Q(M) = {M • u | u ∈ W*}. Similarly, we denote by Q_r(M) the set of row-residuals of M: Q_r(M) = ∪_{1 ≤ i ≤ n} Q(M_{i,*}). M is said rational iff the set Q(M) is finite. One can check that it is equivalent to the property that every coefficient M_{i,j} is rational, or to the property that Q_r(M) is finite. We denote by DRB_{n,m}⟨⟨W⟩⟩ the set of deterministic, rational matrices over B⟨⟨W⟩⟩. For every M ∈ DRB_{n,m}⟨⟨W⟩⟩, we define the norm of M as: ||M|| = Card(Q_r(M)).

Lemma 3.10 Let A ∈ DB_{n,m}⟨⟨W⟩⟩, B ∈ DB_{m,s}⟨⟨W⟩⟩. Then ||A · B|| ≤ ||A|| + ||B||.
3.2  Deterministic  spaces 

We  adapt  here  the  key-idea  of  [Mei89,  Mei92]  to  series. 

Definitions. Let (W, ~) be some structured alphabet and let us consider the set E = DRB⟨⟨W⟩⟩. A series U = Σ_{1 ≤ i ≤ n} γ_i · U_i where γ ∈ DRB_{1,n}⟨⟨W⟩⟩, U_i ∈ DRB⟨⟨W⟩⟩ is called a linear combination of the U_i's. We call deterministic space of rational series (d-space for short) any subset V of E which is closed under finite linear combinations. Given any set 𝒢 = {U_i | i ∈ I} ⊆ E, one can check that the set V of all (finite) linear combinations of elements of 𝒢 is a d-space (by lemma 3.6) and that it is the smallest d-space containing 𝒢. Therefore we call V the d-space generated by 𝒢 and we call 𝒢 a generating set of V (we note V = V({U_i | i ∈ I})). (Similar definitions can be given for families of series.)

We let now W = V. Following an analogy with classical linear algebra, we develop now a notion corresponding to a kind of linear independence of the images by φ of the given series. Let us extend the equivalence relation ≡ to d-spaces by: for every d-spaces V_1, V_2, V_1 ≡ V_2 ⟺ ∀i, j ∈ {1, 2}, ∀S ∈ V_i, ∃S' ∈ V_j, S ≡ S'.

Lemma 3.11 Let S_1, ..., S_j, ..., S_m ∈ DRB⟨⟨V⟩⟩. The following are equivalent:

1. ∃α, β ∈ DRB_{1,m}⟨⟨V⟩⟩, α ≢ β, such that Σ_{1 ≤ j ≤ m} α_j · S_j ≡ Σ_{1 ≤ j ≤ m} β_j · S_j,

2. ∃j_0 ∈ [1, m], ∃γ ∈ DRB_{1,m}⟨⟨V⟩⟩, γ ≢ ε^m_{j_0}, such that S_{j_0} ≡ Σ_{1 ≤ j ≤ m} γ_j · S_j,

3. ∃j_0 ∈ [1, m], ∃γ' ∈ DRB_{1,m}⟨⟨V⟩⟩, γ'_{j_0} ≡ ∅, such that S_{j_0} ≡ Σ_{1 ≤ j ≤ m} γ'_j · S_j,

4. ∃j_0 ∈ [1, m] such that V((S_j)_{1 ≤ j ≤ m}) ≡ V((S_j)_{1 ≤ j ≤ m, j ≠ j_0}).

The equivalence between (1), (2) and (3) was first proved in [Mei89, Mei92], in the case where the S_j's are configurations q_j ω, with the same ω.

4  Deduction  systems 

4.1  General  deduction  systems 

We follow here the general philosophy of [HHY79, Cou83]. Let us call deduction system any triple 𝒟 = <𝒜, H, ⊢> where 𝒜 is a denumerable set called the set of assertions, H, the cost function, is a mapping 𝒜 → ℕ ∪ {∞} and ⊢, the deduction relation, is a subset of 𝒫_f(𝒜) × 𝒜; 𝒜 is given with a fixed bijection with ℕ (an “encoding” or “Gödel numbering”) so that the notions of recursive subset, recursively enumerable subset, recursive function, ... over 𝒜, 𝒫_f(𝒜), ... are defined, up to this fixed bijection; we assume that 𝒟 satisfies the following axioms:

(A1) ⊢ is recursively enumerable,

(A2) ∀(P, A) ∈ ⊢, (min{H(p), p ∈ P} < H(A)) or (H(A) = ∞). (We let min(∅) = ∞.)

In the sequel we use the notation P ⊢ A for (P, A) ∈ ⊢. We call proof in the system 𝒟 any subset P ⊆ 𝒜 fulfilling: ∀p ∈ P, (∃Q ⊆ P, Q ⊢ p). Let us define the total map χ : 𝒜 → {0, 1} and the partial map χ̄ : 𝒜 → {0, 1} by:

χ(A) = 1 if H(A) = ∞, χ(A) = 0 if H(A) < ∞, χ̄(A) = 1 if H(A) = ∞, χ̄ is undefined if H(A) < ∞.
(χ is the “truth-value function”, χ̄ is the “1-value function”.)

Lemma 4.1 Let P be a proof and A ∈ P. Then χ(A) = 1.

In other words: every provable assertion is true. The deduction system 𝒟 will be said complete iff, conversely, ∀A ∈ 𝒜, χ(A) = 1 ⇒ there exists some finite proof P such that A ∈ P. (In other words, 𝒟 is complete iff every true assertion is “finitely” provable.)

Lemma 4.2: If 𝒟 is complete, χ̄ is a recursive partial map.
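
The notion of proof used here is a closure condition on a set of assertions, not a linear derivation, so a proof may be circular. The following Python sketch checks this condition for a finite candidate set against a finite fragment of a deduction relation. The rule list `TOY_RULES` and the assertion names are purely hypothetical; in a real system the relation ⊢ is only recursively enumerable, so a semi-decision procedure in the spirit of lemma 4.2 would enumerate larger and larger finite fragments.

```python
def is_proof(P, rules):
    """Check that every p in P has premises Q, with Q a subset of P and (Q, p) in rules.

    rules is a finite iterable of pairs (premises, conclusion), where premises
    is a frozenset of assertions.
    """
    P = set(P)
    return all(
        any(conclusion == p and premises <= P for premises, conclusion in rules)
        for p in P
    )


# Hypothetical toy relation: "A" is an axiom, "B" follows from "A",
# and "C" supports itself (allowed by (A2) only if H(C) is infinite).
TOY_RULES = [
    (frozenset(), "A"),
    (frozenset({"A"}), "B"),
    (frozenset({"C"}), "C"),
]
```

The set {A, B} is a proof, {B} alone is not because its premise A is missing, and {C} is a proof although it is circular.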

In order to define deduction relations from more elementary ones, we set the following definitions. Let ⊢ ⊆ 𝒫_f(𝒜) × 𝒜. For every P, Q ∈ 𝒫_f(𝒜) we set:

P ⊢^{[0]} Q iff P ⊇ Q; P ⊢^{[1]} Q iff ∀q ∈ Q, ∃R ⊆ P, R ⊢ q; P ⊢^{<0>} Q iff P ⊢^{[0]} Q; P ⊢^{<1>} Q iff ∀q ∈ Q, (∃R ⊆ P, R ⊢ q) or (q ∈ P); P ⊢^{<n+1>} Q iff ∃R ∈ 𝒫_f(𝒜), P ⊢^{<n>} R and R ⊢^{<1>} Q (for every n ≥ 1); ⊢^{<*>} = ∪_{n ≥ 0} ⊢^{<n>}.

Given ⊢_1, ⊢_2 ⊆ 𝒫_f(𝒜) × 𝒫_f(𝒜), for every P, Q ∈ 𝒫_f(𝒜) we set: P (⊢_1 ∘ ⊢_2) Q iff ∃R ∈ 𝒫_f(𝒜), (P ⊢_1 R) ∧ (R ⊢_2 Q).


4.2 System 𝒟_0

Let us define here a particular deduction system 𝒟_0 “tailored for the equivalence problem for dpda's”.

Given a fixed dpda M over the terminal alphabet X, we consider the variable alphabet V associated to M (see section 3.1) and the set DRB⟨⟨V⟩⟩ (the set of Deterministic Rational Boolean series over V*). The set of assertions is defined by: 𝒜 = ℕ × DRB⟨⟨V⟩⟩ × DRB⟨⟨V⟩⟩, i.e. an assertion is here a weighted equation over DRB⟨⟨V⟩⟩.

The “cost-function” H : 𝒜 → ℕ ∪ {∞} is defined by: H(n, S, S') = n + 2 · Div(S, S'), where Div(S, S'), the divergence between S and S', is defined by: Div(S, S') = min{|u| | u ∈ Δ(φ(S), φ(S'))}, Δ denoting the symmetric difference. (We recall min(∅) = ∞.)

Let us notice that here: χ(n, S, S') = 1 ⟺ S ≡ S'.
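
To make the cost function concrete, here is a small Python sketch that computes the divergence and the cost when the languages φ(S) and φ(S') are replaced by finite sets of strings. This is a simplifying assumption for illustration only: in the paper these languages are deterministic context-free and generally infinite. The names `div` and `cost` are hypothetical.

```python
import math


def div(L1, L2):
    """Length of a shortest word in the symmetric difference, or infinity if none."""
    difference = set(L1) ^ set(L2)
    return min((len(u) for u in difference), default=math.inf)


def cost(n, L1, L2):
    """H(n, S, S') = n + 2 * Div(S, S'), with L1, L2 standing in for phi(S), phi(S')."""
    return n + 2 * div(L1, L2)
```

For example, {a, ab} and {a, abb} first differ on a word of length 2, so the weighted equation with weight 3 has cost 7, while two equal languages give an infinite cost, which is exactly the case χ = 1.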

We define a binary relation ⊩ ⊆ 𝒫_f(𝒜) × 𝒜, the elementary deduction relation, as the set of all the pairs having one of the following forms:

(P0)   {(p, S, T)} ⊩ (p + 1, S, T)
(P1)   {(p, S, T)} ⊩ (p, T, S)
(P2)   {(p, S, S'), (p, S', S'')} ⊩ (p, S, S'')
(P3)   ∅ ⊩ (0, S, S)
(P'3)  ∅ ⊩ (0, [qzr], ε)                             (for q, r ∈ Q, z ∈ Z, [qzr] ≡ ε)
(P4)   {(p + 1, S ⊙ x, T ⊙ x) | x ∈ X} ⊩ (p, S, T)    (for S ≢ ε ∧ T ≢ ε)
(P5)   {(p, S, S')} ⊩ (p + 2, S ⊙ x, S' ⊙ x)          (for x ∈ X)
(P6)   {(p, S · T' + T, T')} ⊩ (p, S* · T, T')        (for S ≢ ε)
(P7)   {(p, S, S')} ⊩ (p, S + T, S' + T)
(P8)   {(p, S, S')} ⊩ (p, S · T, S' · T)
(P9)   {(p, S, S')} ⊩ (p, U · S, U · S')

where p ∈ ℕ, S, S', T, T' ∈ DRB⟨⟨V⟩⟩, U ∈ RB⟨⟨V⟩⟩. (By set of “all” these pairs we mean all the pairs which fulfill both properties “to belong to 𝒫_f(𝒜) × 𝒜” and “to have one of these 11 possible forms”; but of course, for example, not all the triples (p, S + T, S' + T) belong to 𝒜 because DRB⟨⟨V⟩⟩ is not closed under sum.)


Lemma 4.3: Let P ∈ 𝒫_f(𝒜), A ∈ 𝒜 such that P ⊩ A. Then min{H(p) | p ∈ P} ≤ H(A).

Let us define ⊢ by: for every P ∈ 𝒫_f(𝒜), A ∈ 𝒜, P ⊢ A ⟺ P ⊩^{<*>} ∘ ⊩^{[1]}_{0,3,4} ∘ ⊩^{<*>} {A}, where ⊩_{0,3,4} is the relation defined by P0, P3, P'3, P4 only. We let 𝒟_0 = <𝒜, H, ⊢>.

Lemma 4.4: 𝒟_0 is a deduction system.

The key statement of this work is that 𝒟_0 is complete (theorem 9.2). We prove this completeness result by exhibiting a “strategy” S which, for every true assertion (n, S, S'), constructs a finite 𝒟_0-proof of this assertion. Notice that, by lemma 4.2, we do not need to prove that S is computable in any sense to establish that χ̄ is partial-recursive.
4.3 Strategies

Let 𝒟 = <𝒜, H, ⊢> be a deduction system. We call a strategy for 𝒟 any partial map S : 𝒜^+ → 𝒜* such that:

(S1) if S(A_1 A_2 ··· A_n) = B_1 ··· B_m then ∃Q ⊆ {A_i | 1 ≤ i ≤ n − 1} such that {B_j | 1 ≤ j ≤ m} ∪ Q ⊢ A_n,

(S2) if S(A_1 A_2 ··· A_n) = B_1 ··· B_m then min{H(A_i) | 1 ≤ i ≤ n} = ∞ ⇒ min{H(B_j) | 1 ≤ j ≤ m} = ∞.

Given a strategy S, we define T(S, A), the proof-tree associated to the strategy S and the assertion A, as the unique tree t such that:

ε ∈ dom(t), t(ε) = A, and, for every path x_0 x_1 ··· x_{n−1} in t, with labels t(x_i) = A_{i+1} (for 0 ≤ i ≤ n − 1), if x_{n−1} has m sons x_{n−1} · 1, ..., x_{n−1} · m ∈ dom(t) with labels t(x_{n−1} · j) = B_j (for 1 ≤ j ≤ m) then

S(A_1 ··· A_n) = B_1 ··· B_m or (m = 0 and A_1 ··· A_n ∉ dom(S)).

Let us say that S terminates iff, ∀A ∈ χ^{-1}(1), T(S, A) is finite; S is said closed iff, ∀W ∈ 𝒜^+, W ∈ (χ^{-1}(1))^+ ⇒ W ∈ dom(S) (i.e. S is defined on every non-empty sequence of true assertions).
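
The proof-tree T(S, A) is obtained by applying the strategy to the sequence of labels along each branch. The Python sketch below illustrates only this unfolding. It assumes a strategy is a function taking the tuple of labels on the current path and returning a list of sons, or `None` where the strategy is undefined. The function `toy_strategy` is a hypothetical stand-in on integers and is not claimed to satisfy (S1) or (S2) for any deduction system; the `max_depth` guard is an added safety measure, since a proof-tree may be infinite.

```python
def proof_tree(strategy, assertion, path=(), max_depth=50):
    """Unfold the tree as nested pairs (label, list of subtrees)."""
    path = path + (assertion,)
    if len(path) > max_depth:
        raise RecursionError("branch longer than max_depth; the tree may be infinite")
    sons = strategy(path)
    if sons is None:
        # the path is outside dom(S): the node is a leaf (m = 0)
        sons = []
    return (assertion, [proof_tree(strategy, b, path, max_depth) for b in sons])


def size(tree):
    """Number of nodes of a tree built by proof_tree."""
    return 1 + sum(size(t) for t in tree[1])


def toy_strategy(path):
    """Hypothetical strategy on integers, only meant to produce a small tree."""
    n = path[-1]
    if n in path[:-1]:
        return []        # defined with no sons: the branch is closed
    if n <= 1:
        return None      # undefined on this path
    return [n // 2, n // 3]
```

Starting from the label 6, the toy strategy yields a finite tree with seven nodes.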

Lemma 4.5: If S is a closed strategy for 𝒟, then, for every true assertion A, the set of labels of T(S, A) is a 𝒟-proof.

Lemma 4.6: If 𝒟 admits some terminating, closed strategy then 𝒟 is complete.

5  Triangulations 

Let S_1, S_2, ..., S_d be a family of deterministic series over the structured alphabet V (we recall V is the alphabet associated with some dpda M, as defined in section 2.2).

Let us consider a sequence 𝒮 of n “weighted” linear equations:

(E_i) : p_i, Σ_{j=1}^{d} α_{i,j} S_j, Σ_{j=1}^{d} β_{i,j} S_j

where p_i ∈ ℕ, and A = (α_{i,j}), B = (β_{i,j}) are deterministic rational matrices of dimension (n, d), with indices m ≤ i ≤ m + n − 1, 1 ≤ j ≤ d. For any weighted equation E = (p, S, S'), we recall the “cost” of this equation is: H(E) = p + 2 · Div(φ(S), φ(S')).

We associate to such a system another system of equations, INV(𝒮), which “translates the equations of 𝒮 into equations over (α_{i,j}, β_{i,j}) only”. This function INV is in some sense an “elaborated version” of the inverse systems defined in [Mei89, Mei92]. The general idea of the construction of INV consists in iterating the transformation used in the proof of (1) ⇒ (2) ⇒ (3) in lemma 3.11, i.e. the classical idea of triangulating a system of linear equations. Of course we must deal with the weights and relate the construction with the deduction system 𝒟_0. Let us assume here that

∀j ∈ [1, d], S_j ≢ ∅. (9)

For every S ∈ B⟨⟨X⟩⟩ (resp. S' ∈ B_{1,d}⟨⟨X⟩⟩), we define ν(S) = min{|u|, u ∈ supp(S)} (resp. ν(S') = min{|u|, u ∈ ∪_{1 ≤ j ≤ d} supp(S'_j)}). Let us define INV(𝒮), W(𝒮) ∈ ℕ ∪ {⊥}, D(𝒮) ∈ ℕ by induction on n. W(𝒮) is the weight of 𝒮. D(𝒮) is the weak codimension of 𝒮.

Case 1: φ(α_{m,*}) = φ(β_{m,*}) or n = 1.

INV(𝒮) = ((W(𝒮), α_{m,j}, β_{m,j}))_{1 ≤ j ≤ d}, W(𝒮) = p_m − 1, D(𝒮) = 0.

Case 2: φ(α_{m,*}) ≠ φ(β_{m,*}), n ≥ 2, p_{m+1} − p_m ≥ 2 · ν(Δ(φ(α_{m,*}), φ(β_{m,*}))) + 1.

Let u = min Δ(φ(α_{m,*}), φ(β_{m,*})). Suppose u ∈ Δ(φ(α_{m,j_0}), φ(β_{m,j_0})).

Subcase 1: α_{m,j_0} ⊙ u = ε, β_{m,j_0} ⊙ u = ∅.

Let us consider the equation (E'_m) : (p_m, S_{j_0}, Σ_{j=1}^{d} (β_{m,j} ⊙ u) S_j) and define a new system of weighted equations 𝒮' = (E'_i)_{m+1 ≤ i ≤ m+n−1} by

(E'_i) : p_i, Σ_{j ≠ j_0} (α_{i,j} + α_{i,j_0} (β_{m,j} ⊙ u)) S_j, Σ_{j ≠ j_0} (β_{i,j} + β_{i,j_0} (β_{m,j} ⊙ u)) S_j

where the above equation is seen as an equation between two linear combinations of the S_j's where the j_0-th coefficient is ∅ on both sides. We then define:

INV(𝒮) = INV(𝒮'), W(𝒮) = W(𝒮'), D(𝒮) = D(𝒮') + 1. (10)

Subcase 2: α_{m,j_0} ⊙ u = ε, β_{m,j_0} ⊙ u ≠ ∅.

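Let us consider the weighted equation (E'_m) : (p_m, S_{j_0}, Σ_{j=1}^{d} (β_{m,j} ⊙ u) S_j) and define a new system of weighted equations 𝒮' = (E'_i)_{m+1 ≤ i ≤ m+n−1} by

(E'_i) : p_i, Σ_{j ≠ j_0} (α_{i,j} + α_{i,j_0} (β_{m,j_0} ⊙ u)* (β_{m,j} ⊙ u)) S_j, Σ_{j ≠ j_0} (β_{i,j} + β_{i,j_0} (β_{m,j_0} ⊙ u)* (β_{m,j} ⊙ u)) S_j.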


We then set the same definitions (10) as above.

Subcase 3: α_{m,j_0} ⊙ u = ∅, β_{m,j_0} ⊙ u = ε. (Analogous to subcase 1.)

Subcase 4: α_{m,j_0} ⊙ u ≠ ∅, β_{m,j_0} ⊙ u = ε. (Analogous to subcase 2.)

Case 3: φ(α_{m,*}) ≠ φ(β_{m,*}), n ≥ 2, p_{m+1} − p_m ≤ 2 · ν(Δ(φ(α_{m,*}), φ(β_{m,*}))).

We then define: INV(𝒮) = ⊥, W(𝒮) = ⊥, D(𝒮) = 0, where ⊥ is a special symbol which can be understood as meaning “undefined”.


Lemma 5.1: Let 𝒮 be a system of linear equations. If INV(𝒮) ≠ ⊥ then INV(𝒮) = (E'_j)_{1 ≤ j ≤ d} fulfills:
1. ∀j ∈ [1, d], E'_j is a linear equation with deterministic coefficients,

2. {E'_j | 1 ≤ j ≤ d} ∪ {E_i | m ≤ i ≤ m + D(𝒮) − 1} ⊢ E_{m+D(𝒮)}.
If, in addition, n ≥ d then:

3. min{H(E_i) | m ≤ i ≤ m + D(𝒮)} = ∞ ⟹ min{H(E'_j) | 1 ≤ j ≤ d} = ∞.

Let us consider the function F defined by:

F(n) = max{ν(φ(A) Δ φ(B)) | A, B ∈ DRB_{1,d}⟨⟨V⟩⟩, ||A|| ≤ n, ||B|| ≤ n, φ(A) ≠ φ(B)}.

For  every  integer  parameters  /\i,  A'2,  I<3,  I<4  e  ]N-{0},  we  define  integer  sequences  (6i,  Li,Si,Si,  r,)m<i<m+r.-i 

l3y  : 

Sm  ~0,(m  =  0,  Lm  =  ^2,  Sm  ~  Kz  '  Kl  +  d<'4>  =  0,  Sim  =  0,  (11) 

and  for  every  m<i<rn  +  n~2, 

5i+i  =  2  •  F(s,  +  Li)  +  1,  ^,+1  =  5  ■  6i+i  +  14,  Lf+i  =  A'l  •  (Li  +  ii+i)  +  I<2, 

si+i  =  Ks  •  Li+i  +  lu,  Si+i  =  Si  +  LiA  I  Q  \  F(s,-  +  T,),  X’,+i  =  X,-  +  S.+i-  (12) 

For every weighted, deterministic rational linear equation E = (p, Σ_{j=1}^{d} α_j S_j, Σ_{j=1}^{d} β_j S_j), we define |||E||| = max{||α||, ||β||}.

Lemma 5.2 Let 𝒮 = (E_i)_{m ≤ i ≤ m+d−1} be a system of d weighted linear equations such that:

(1) ∀i ∈ [m, m + d − 1], |||E_i||| ≤ s_i,

(2) ∀i ∈ [m, m + d − 2], W(E_{i+1}) − W(E_i) ≥ δ_{i+1}.

Then INV(𝒮) ≠ ⊥, D(𝒮) ≤ d − 1 and ∀E ∈ INV(𝒮), |||E||| ≤ Σ_{m+D(𝒮)} + s_{m+D(𝒮)}.




6  Constants 

The  following  constants  will  be  used  in  the  sequel. 

k-Q  =  max{^'([p.4(/])  \  p,q  E.  Q,  A  E  Z,  [p-47]  ^  0},  t*i  =  max{2t'o  +  1,3},  to  =  4ki  +  2{lci +  to- 

D\  =  AIcq  +  2,  Ai  =  ti  +  1,  A'o  =  2(ti)^  +  3{ti )“  +  tj  +  1, 

A'3  =  tolQI,  A'4  =  tofQI-  +  (to  +  6)|Q|, 

4  =  2lQ|Card(A^^''). 

We consider now the integer sequences (δ_i, ℓ_i, L_i, s_i, S_i, Σ_i)_{m ≤ i ≤ m+n−1} defined by the relations (11, 12) of section 5, where the parameters K_1, ..., K_4 are chosen to be the above constants and the function F is associated with d = d_0, and m = 1, n = d_0.


L>7  —  LJdo  +  Sdo  ■ 


7 Strategies for 𝒟_0

Let us define strategies for the particular system 𝒟_0.

We define first auxiliary strategies T_cut, T_∅, T_ε, T_A, T_B, T_C and then derive some closed strategies from them. Let us fix here some total ordering on X: x_1 < x_2 < ··· < x_{|X|} and also some total ordering ≤ of type ω on 𝒜 (inherited from the usual well-ordering of ℕ by the fixed encoding). From these orderings one can construct in the usual way an ordering of type ω on the sets 𝒜* and ℕ* × (DRB⟨⟨V⟩⟩)*.

Let us adapt the usual notion of stacking derivation to derivations of series. For every u ∈ X* we define the binary relation ↑(u) over DB⟨⟨V⟩⟩ by: for every S, S' ∈ DB⟨⟨V⟩⟩, S ↑(u) S' ⟺ ∃A ∈ Z, ω ∈ Z^+, p, q ∈ Q, Φ ∈ DB_{Q,1}⟨⟨V⟩⟩ such that

S = [pA] ∗ Φ, [pA] ⊙ u = [qω], S' = [qω] ∗ Φ.

A sequence of deterministic series S_0, S_1, ..., S_n is a derivation iff there exist x_1, ..., x_n ∈ X such that S_0 ⊙ x_1 = S_1, ..., S_{n−1} ⊙ x_n = S_n. If u = x_1 · x_2 ··· x_n we call S_0, S_1, ..., S_n the derivation associated with (S_0, u). A derivation S_0, S_1, ..., S_n is said to be stacking iff it is the derivation associated to a pair (S, u) such that S = S_0 and S_0 ↑(u) S_n.

T_cut: T_cut(A_1 ··· A_n) = B_1 ··· B_m iff ∃i ∈ [1, n − 1], ∃S, T, A_i = (p_i, S, T), A_n = (p_n, S, T), p_i < p_n and m = 0.

T_∅: T_∅(A_1 A_2 ··· A_n) = B_1 ··· B_m iff ∃S, T, A_n = (p, S, T), p ≥ 0, S = T = ∅ and m = 0.

T_ε: T_ε(A_1 ··· A_n) = B_1 ··· B_m iff A_n = (p, S, T), p ≥ 0, S = T = ε and m = 0.

T_A: T_A(A_1 ··· A_n) = B_1 ··· B_m iff A_n = (p, S, T), m = |X|, B_1 = (p + 1, S ⊙ x_1, T ⊙ x_1), ..., B_m = (p + 1, S ⊙ x_m, T ⊙ x_m), where S ≢ ε, T ≢ ε.

7'b'-  T'giAi  ■  ■  -  An)  =  Si  •  •  Bm  iff  n  >  fci,  An-t,  =  {■w,U  ,U'),  (where  U  is  unmarked) 

U'  =  ^[pAg]  ■  Vq  (  for  some  (p  G  Q) 
l€Q 

Aj  =  (tt  4-  ki  +  i  -  n,Ui,Vl)  for  n  —  ^ri  <  i  <  n,  {U' i)n~kx<i<n  is  a  "stacking  derivation”  (see  the 
above  definition), 

y'  =  ^  \prq]  ■  Vq ,  for  some  p  6  Q,  r  G  5+ , 

?6Q 

m  =  1,  Si  =  (T  +  iq  -  1, 1/,  K'),  V  =  Un,V'  ^  M  •  (5  0  «,), 

where  Q'  =  [q  ^  Q  \  \pAq]  ^  0},  Vr/  G  Q' ,  u,  =  min(<p([pA^])). 

T_B^-: T_B^- is defined in the same way as T_B^+ by exchanging the left series (S^-) and right series (S^+) in every assertion (p, S^-, S^+).

T_C: T_C(A_1 ··· A_n) = B_1 ··· B_m iff there exist d ∈ [1, d_0], S_1, S_2, ..., S_d ∈ DRB⟨⟨V⟩⟩, 1 ≤ κ_1 < κ_2 < ··· < κ_d = n, such that

(C1) every equation E_i = A_{κ_i} is a weighted equation over S_1, S_2, ..., S_d,

(C2) 𝒮 = (E_i)_{1 ≤ i ≤ d} fulfills the hypothesis of lemma 5.2,

(C3) (κ_1, κ_2, ..., κ_d, S_1, ..., S_d) ∈ ℕ^d × (DRB⟨⟨V⟩⟩)^d is the minimal vector satisfying conditions (C1, C2) for the given sequence (A_1 ··· A_n) and

(C4) B_1 ··· B_m = ρ_e(INV(𝒮)) (where ρ_e is the obvious extension of ρ_e to pairs of series and then to sequences of weighted equations; in other words the result of T_C is INV(𝒮) where the marks have been removed).

Let us notice that, by lemma 5.2 and lemma 3.9, for every j ∈ [1, m], |||B_j||| ≤ Σ_{m+D(𝒮)} + s_{m+D(𝒮)} ≤ Σ_{d_0} + s_{d_0} = D_2. This inequality is independent of the sizes of the series appearing in the lefthand sides (or rhs) of the initial equations A_1 ··· A_n.


Lemma 7.1: $T_{cut}, T_\emptyset, T_\varepsilon, T_A, T_B, T_C$ are $\mathcal{D}_0$ strategies.


Let us define the strategy $S_{AB}$ by: for every $W = A_1 A_2 \cdots A_n$,

(0) if $W \in \mathrm{dom}(T_{cut})$, then $S_{AB}(W) = T_{cut}(W)$
(1) elsif $W \in \mathrm{dom}(T_\emptyset)$, then $S_{AB}(W) = T_\emptyset(W)$
(2) elsif $W \in \mathrm{dom}(T_\varepsilon)$, then $S_{AB}(W) = T_\varepsilon(W)$
(4) elsif $W \in \mathrm{dom}(T_B^+)$, then $S_{AB}(W) = T_B^+(W)$
(5) elsif $W \in \mathrm{dom}(T_B^-)$, then $S_{AB}(W) = T_B^-(W)$
(6) elsif $W \in \mathrm{dom}(T_A)$, then $S_{AB}(W) = T_A(W)$
(7) else $S_{AB}(W)$ is undefined.

The strategy $S_{ABC}$ is obtained by inserting "(3) elsif $W \in \mathrm{dom}(T_C)$, then $S_{ABC}(W) = T_C(W)$" in the
above list of cases.


Lemma 7.2: $S_{ABC}$ is closed.


8  Tree  analysis 

This section is devoted to the analysis of the proof-trees $\tau$ produced by the strategy $S_{AB}$ defined in
section 7. The main results are [Sen97, lemma 8.14, 8.15] whose combination asserts that if some path
(from a node $x$ to a node $y$) of $\tau$ is such that its origin has a "small norm" and its length is "large
enough", then the transformation $T_C$ is defined at some ancestor of $y$.¹

9 Completeness of $\mathcal{D}_0$

Lemma 9.1: $S_{ABC}$ is terminating.

The  proof  leans  on  the  two  delicate  lemmas  [Sen97,  lemma  8.14  ,  8.15]  mentioned  above. 

Theorem 9.2 The system $\mathcal{D}_0$ is complete.

Proof: By lemma 7.1, $S_{ABC}$ is a strategy for $\mathcal{D}_0$; by lemma 7.2, $S_{ABC}$ is closed; by lemma 9.1, it is
terminating; and by lemma 4.6, $\mathcal{D}_0$ is complete. □


Theorem  9.3  The  equivalence  problem  for  deterministic  pushdown  automata  is  decidable. 

Proof: Let $\mathcal{M}$ be some dpda. The equivalence relation $\equiv$ on DRB⟨⟨V⟩⟩ (where $V$ is the structured
alphabet associated to the given $\mathcal{M}$) has a recursively enumerable complement (this is well-known). By
theorem 9.2 and lemma 4.2, $\equiv$ is recursively enumerable too. Hence $\equiv$ is recursive. In addition, the system
$\mathcal{D}_0$ associated with $\mathcal{M}$ is computable from $\mathcal{M}$, hence the theorem follows. □
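
The last step of this proof is the classical observation that a relation is recursive as soon as both it and its complement are recursively enumerable. A minimal sketch of that dovetailing argument follows, under loud assumptions: the helpers `search`, `decide` and the toy predicate `is_square` are hypothetical stand-ins chosen for illustration and have nothing to do with the actual enumeration of $\mathcal{D}_0$-proofs.

```python
from itertools import count
from math import isqrt


def search(candidates, target):
    """Semi-decision procedure: yields False until `target` shows up, then True."""
    for c in candidates:
        yield c == target


def decide(semi_yes, semi_no):
    """Alternate one step of each procedure; exactly one of them must succeed."""
    while True:
        if next(semi_yes, False):
            return True
        if next(semi_no, False):
            return False


def is_square(k):
    # Toy instance (assumption): both the set of squares and its complement
    # are enumerated, so running the two searches in lockstep always halts.
    squares = (i * i for i in count())
    non_squares = (n for n in count() if isqrt(n) ** 2 != n)
    return decide(search(squares, k), search(non_squares, k))
```

Neither search alone terminates on every input, but the interleaving does, which is exactly the shape of the argument used for $\equiv$.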


¹ Technically speaking, this is the most difficult part of the full proof; we cannot sketch it here due to the lack of
space.
Abstract. We will describe the recognizable formal power series over
arbitrary semirings and in partially commuting variables, i.e. over trace
monoids. We prove that the recognizable series are certain rational power
series, which can be constructed from the polynomials by using the
operations sum, product and a restricted star which is applied only to series
for which the elements in the support all have the same connected
alphabet. The converse is true if the underlying semiring is commutative.
Moreover, if in addition the semiring is idempotent then the same
result holds with a star restricted to series for which the elements in the
support have connected (possibly different) alphabets. It is shown that
these assumptions over the semiring are necessary. This provides a joint
generalization of Kleene's, Schützenberger's and Ochmański's theorems.


1  Introduction 

In  the  theory  of  automata  and  formal  languages,  Kleene’s  foundational  theorem 
on  the  coincidence  of  regular  and  rational  languages  in  free  monoids  has  been 
extended in many ways. Schützenberger [15] investigated formal power series
over  arbitrary  semirings  (e.g.,  like  the  natural  numbers)  and  the  free  monoid, 
i.e.  in  noncommuting  variables,  and  showed  that  the  recognizable  formal  power 
series  coincide  with  the  rational  ones.  This  was  the  starting  point  for  a  large 
amount  of  work  on  formal  power  series,  cf.  [14,9,2,8]  for  surveys.  The  concept 
of  recognizable  formal  power  series  has  also  been  defined  for  arbitrary  monoids 
instead  of  the  free  monoid,  but  it  was  clear  and  has  been  stressed  by  several 
authors  (cf.,  e.g.  [14])  that  in  general  then  the  recognizable  and  the  rational 
series  do  not  coincide. 

On the other hand, Mazurkiewicz [10,11] introduced an important mathematical
model for the behaviour of concurrent systems: trace monoids (or free partially
commutative monoids), see also [3,1,4-6] for their well-developed theory.
They are monoids whose generators are partially commutative. Again, their
recognizable languages do not coincide with the rational ones, but by Ochmański's
theorem [12] they coincide with the c-rational languages where the iteration is
restricted to connected languages.

* This research was partly carried out during a stay of the first author in Paris and
another stay of the second author in Dresden.

It  is  the  aim  of  this  paper  to  investigate  recognizable  formal  power  series  over 
trace monoids, thereby obtaining a generalization of both Schützenberger's and
Ochmański's results.

We denote by $K\langle\langle M\rangle\rangle$ the set of all formal power series over the semiring $K$
and the free partially commutative monoid $M$. It is known that in general the
recognizable series in $K\langle\langle M\rangle\rangle$ form a proper subclass of the rational ones. We
therefore define the subclasses of c-rational and mc-rational series. We say that
a series $S$ is connected, if each element of its support is connected, and $S$ is
mono-alphabetic, if all elements of its support have the same set of generators. The
c-rational series are obtained from the polynomials by allowing the operations sum,
product, and star, but the latter applied only to proper and connected series.
The mc-rational series are constructed in the same way, but using star only for
series which are proper, mono-alphabetic and connected. In view of Ochmański's
result, one might expect that the recognizable series in $K\langle\langle M\rangle\rangle$ coincide with the
c-rational ones. However, we will show that this fails in general even for the
semiring $(\mathbb{N}, +, \times)$. Our main result is the following:

Theorem 1. Let $M$ be a trace monoid and $K$ a semiring.

(a) Each recognizable series in $K\langle\langle M\rangle\rangle$ is mc-rational.

(b) If $K$ is commutative, each mc-rational series in $K\langle\langle M\rangle\rangle$ is recognizable.

(c) If $K$ is commutative and idempotent, each c-rational series in $K\langle\langle M\rangle\rangle$ is
recognizable.

The fact that the recognizable series in $K\langle\langle M\rangle\rangle$ are closed under the product
operation  was  proved  before  already  by  Fliess  [7],  but  only  for  very  specific 
semirings  K  (strong  Fatou  semirings  or  the  Boolean  semiring).  By  Theorem  1(b), 
this  holds  for  arbitrary  commutative  semirings,  and  we  show  by  example  that 
the  commutativity  of  K  is  needed  for  this. 

Theorem 1(b,c) is proved in section 3. There we also show that if the star $S^*$
of a recognizable proper series $S$ is connected, then it is also recognizable. This
gives another closure property of the recognizable series under the star-operation.
Part (a) of Theorem 1 is proved in section 4, and in section 5 we give examples
and discuss the relationship with Schützenberger's and Ochmański's results. For
lack of space, most proofs are not contained in this extended abstract.

It  seems  a  very  interesting  research  road  to  investigate  which  other  results 
from the theory of formal power series over non-commuting variables can be
extended  to  series  over  partially  commuting  variables,  i.e.  over  trace  monoids. 


2  Background 

Here we recall the necessary notation and background for formal power series
and  of  trace  theory.  For  more  details,  we  refer  the  reader  to  [14,2,4,6]. 

Let $M$ be any monoid and $K = (K, +, \cdot, 0, 1)$ any semiring, i.e., $(K, +, 0)$
is a commutative monoid, $(K, \cdot, 1)$ is a monoid, multiplication distributes over
addition, and $0 \cdot x = x \cdot 0 = 0$ for each $x \in K$. If multiplication is commutative,
we say that $K$ is commutative. If the addition is idempotent, then the semiring
is called idempotent. For instance, the semiring $(\mathbb{R} \cup \{\infty\}, \min, +, \infty, 0)$ is both
commutative and idempotent.
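
To make the two side conditions concrete, here is a minimal sketch that packages a semiring as a record of operations and spot-checks idempotency and commutativity on sample values. The names `Semiring`, `NAT`, `BOOL`, `TROPICAL` and the two checking helpers are illustrative assumptions; checking finitely many samples is of course not a proof.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Semiring:
    add: Callable[[Any, Any], Any]
    mul: Callable[[Any, Any], Any]
    zero: Any
    one: Any


# Three standard instances (the names are ours, not the paper's).
NAT = Semiring(lambda x, y: x + y, lambda x, y: x * y, 0, 1)
BOOL = Semiring(lambda x, y: x or y, lambda x, y: x and y, False, True)
# The (R ∪ {∞}, min, +, ∞, 0) semiring mentioned in the text.
TROPICAL = Semiring(min, lambda x, y: x + y, float("inf"), 0.0)


def is_idempotent_on(K, samples):
    """Check x + x = x on the given sample values only."""
    return all(K.add(x, x) == x for x in samples)


def is_commutative_on(K, samples):
    """Check x * y = y * x on the given sample values only."""
    return all(K.mul(x, y) == K.mul(y, x) for x in samples for y in samples)
```

On samples such as `[0.0, 1.5, float("inf")]` the tropical semiring passes both checks, while `NAT` is commutative but not idempotent since `1 + 1 != 1`.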

Mappings $S$ from $M$ into $K$ are called formal power series. They are
denoted as formal sums $S = \sum_{m \in M} (S, m)\, m$ where $(S, m) = S(m) \in K$. The set
$\mathrm{supp}(S) = \{m \in M \mid (S, m) \neq 0\}$ is called the support of $S$, and if it is finite,
then $S$ is called a polynomial. The collection of all formal power series is denoted
by $K\langle\langle M\rangle\rangle$, and its subset of all polynomials by $K\langle M\rangle$. We consider elements
of $K$ also as polynomials in the natural way, having a non-zero entry only at
$1 \in M$. If $L \subseteq M$, we define the characteristic series of $L$ by $\mathbb{1}_L = \sum_{m \in L} 1 \cdot m$.

Let $n \geq 1$ and $[n] = \{1, \ldots, n\}$. We let $K^{n \times n}$ be the monoid of all $(n \times n)$-matrices
over $K$ (with matrix multiplication as usual). A series $S \in K\langle\langle M\rangle\rangle$
is called recognizable, if there exists an integer $n \geq 1$, a monoid morphism
$\mu : M \to K^{n \times n}$ and vectors $\lambda \in K^{1 \times n}$, $\gamma \in K^{n \times 1}$ such that

$$(S, m) = \lambda \cdot (\mu m) \cdot \gamma = \sum_{i,j \in [n]} \lambda_i\, (\mu m)_{ij}\, \gamma_j$$

for each $m \in M$. In this case, the triple $(\lambda, \mu, \gamma)$ is called a representation of
$S$, and we often shortly write $S = (\lambda, \mu, \gamma)$ to denote this. If $i, j \in [n]$, we also
abbreviate $(\mu m)_{ij} =: \mu m_{ij}$. We let $K^{\mathrm{rec}}\langle\langle M\rangle\rangle$ denote the set of all recognizable
formal power series.
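
As an illustration of this definition, the following sketch evaluates $(S, w) = \lambda \cdot \mu(w) \cdot \gamma$ over the semiring $(\mathbb{N}, +, \times)$. The two-state representation `LAM`, `MU`, `GAMMA` is a hypothetical example of ours: it recognizes the series that maps a word to its number of occurrences of the letter `a`. Since `MU["a"]` and `MU["b"]` commute, the same matrices also define a morphism on the trace monoid in which $a \mathrel{I} b$.

```python
def mat_mul(A, B):
    """Product of two square integer matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def coefficient(lam, mu, gamma, word):
    """Compute lam * mu(word) * gamma, where mu is given on single letters."""
    n = len(lam)
    M = [[int(i == j) for j in range(n)] for i in range(n)]  # mu(empty word)
    for letter in word:
        M = mat_mul(M, mu[letter])
    return sum(lam[i] * M[i][j] * gamma[j] for i in range(n) for j in range(n))


# Hypothetical representation: (S, w) = number of occurrences of "a" in w.
LAM = [1, 0]
GAMMA = [0, 1]
MU = {"a": [[1, 1], [0, 1]], "b": [[1, 0], [0, 1]]}
```

For instance `coefficient(LAM, MU, GAMMA, "abba")` evaluates the entry $(1,2)$ of $\mu(abba)$, and the words `ab` and `ba` receive the same coefficient.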

With componentwise addition, $K\langle\langle M\rangle\rangle$ becomes a commutative monoid. Now,
the (Cauchy) product of two series $S, S'$ in $K\langle\langle M\rangle\rangle$ is the series defined for $m \in M$
by $(S \cdot S', m) = \sum_{m = m_1 m_2} (S, m_1) \cdot (S', m_2)$ provided the sum is defined (e.g.
when the sum is finite). With this, $K\langle\langle M\rangle\rangle$ is a semiring. The powers $S^n$ ($n \geq 0$)
are defined in the natural way. We call $S$ proper, if $(S, 1) = 0$, and then we put, in
the natural way, $S^* = \sum_{n \geq 0} S^n$ (the iteration of $S$), and $S^+ = \sum_{n \geq 1} S^n$
provided it is defined. We let $K^{\mathrm{rat}}\langle\langle M\rangle\rangle$ denote the smallest subset of $K\langle\langle M\rangle\rangle$
which contains all polynomials and is closed under the operations sum, product
and star, where the latter is only applied to proper series. Its elements are called
rational formal power series. Now Schützenberger's theorem states the following
equivalence between recognizable and rational series over the free monoid.
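
The Cauchy product and the star can be tried out on polynomials over the free monoid with coefficients in $\mathbb{N}$. The sketch below is an assumption-laden toy: series are stored as dictionaries from words to coefficients, and the hypothetical helper `star_truncated` only computes the coefficients of $S^*$ on words of length at most `max_len`, which is enough because $S^n$ has no word shorter than $n$ in its support when $S$ is proper.

```python
from collections import defaultdict


def cauchy(S, T):
    """Cauchy product of two polynomials given as {word: coefficient} dicts."""
    out = defaultdict(int)
    for u, x in S.items():
        for v, y in T.items():
            out[u + v] += x * y
    return dict(out)


def star_truncated(S, max_len):
    """Coefficients of S* = sum of S^n, restricted to words of length <= max_len."""
    assert S.get("", 0) == 0, "the star is only defined here for proper series"
    result = {"": 1}   # S^0
    power = {"": 1}
    for _ in range(max_len):
        power = {w: c for w, c in cauchy(power, S).items() if len(w) <= max_len}
        for w, c in power.items():
            result[w] = result.get(w, 0) + c
    return result
```

With `S = {"a": 1, "b": 1}` every word gets coefficient 1 in the truncated star, while `S = {"a": 2}` yields the coefficient $2^n$ on $a^n$.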

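Theorem 2 (Schützenberger, [15]). Let $\Sigma$ be any finite set and $K$ any semiring.
Then $K^{\mathrm{rec}}\langle\langle \Sigma^*\rangle\rangle = K^{\mathrm{rat}}\langle\langle \Sigma^*\rangle\rangle$.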

From this, Kleene's theorem on the coincidence of regular and rational languages
follows by considering the Boolean semiring $\mathbb{B} = \{0, 1\}$ (with $1 + 1 =
1 \cdot 1 = 1$) and noting that a language $L \subseteq \Sigma^*$ is regular iff its characteristic series
$\mathbb{1}_L \in \mathbb{B}\langle\langle \Sigma^*\rangle\rangle$ is recognizable, and similarly for rationality.

Later we will also need the Hadamard product $S \odot T$ of two series $S, T \in
K\langle\langle M\rangle\rangle$. It is defined by $(S \odot T, m) = (S, m) \cdot (T, m)$ for all $m \in M$.

Next we recall basic notions from trace theory. A pair $(\Sigma, I)$ is called a trace
alphabet, if $\Sigma$ is a finite set and $I$ is an irreflexive symmetric binary independence
relation on $\Sigma$. Let $\sim$ denote the smallest congruence on $\Sigma^*$ containing $\{(ab, ba) :
a \mathrel{I} b\}$. The quotient monoid $M = M(\Sigma, I) = \Sigma^*/\!\sim$ is called the trace monoid
(or free partially commutative monoid) over $(\Sigma, I)$. If $w \in \Sigma^*$, we let $[w]$ denote
the equivalence class of $w$ in $M$. Also, let $\alpha(w)$ be the set of all letters of $\Sigma$
occurring in $w$, called the alphabet of $w$. Since equivalent words have the same
alphabet, we may put $\alpha([w]) = \alpha(w)$. If $A, B \subseteq \Sigma$, we write $A \mathrel{I} B$ to denote
that $a \mathrel{I} b$ for all $a \in A$, $b \in B$. We also write $w \mathrel{I} A$ or $[w] \mathrel{I} A$ to abbreviate
that $\alpha(w) \mathrel{I} A$, similarly, $w \mathrel{I} w'$ for $\alpha(w) \mathrel{I} \alpha(w')$, etc. A subset $A \subseteq \Sigma$ is called
connected, if it cannot be split $A = B \cup C$ into two non-empty subsets such that
$B \mathrel{I} C$. Again, $w$ and $[w]$ are connected, if $\alpha(w)$ is connected. A language $L \subseteq M$
or $L \subseteq \Sigma^*$ is called connected, if each of its elements is connected, and
mono-alphabetic, if $\alpha(m) = \alpha(m')$ for all $m, m' \in L$. Then the collection of all c-rational
languages in $M$ (respectively, in $\Sigma^*$) is defined as the smallest set of languages
of $M$ (respectively, of $\Sigma^*$) containing all finite languages and which is closed
under the operations union, product and star, where the latter is applied only
to connected languages. The following characterizes the recognizable languages
of $M$ (recall that a language $L \subseteq M$ is recognizable iff it is accepted by some
finite $M$-automaton, or, equivalently, iff its syntactic monoid is finite).
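
The notions of alphabet, connectedness and trace equivalence are easy to experiment with on small examples. In the sketch below, the independence relation is stored as a set of ordered pairs of letters; the helper names `symmetric`, `is_connected` and `trace_class` are ours, and the brute-force enumeration of a trace is only meant for short words.

```python
def symmetric(pairs):
    """Close a set of independent letter pairs under symmetry."""
    return set(pairs) | {(b, a) for (a, b) in pairs}


def is_connected(letters, independent):
    """A set of letters is connected iff its dependence graph is connected."""
    letters = set(letters)
    if not letters:
        return True  # the empty set cannot be split into two non-empty parts
    start = next(iter(letters))
    seen, stack = {start}, [start]
    while stack:
        a = stack.pop()
        for b in letters - seen:
            if (a, b) not in independent:  # a and b are dependent
                seen.add(b)
                stack.append(b)
    return seen == letters


def trace_class(word, independent):
    """All words equivalent to `word`, obtained by swapping independent neighbours."""
    seen, stack = {word}, [word]
    while stack:
        w = stack.pop()
        for i in range(len(w) - 1):
            if (w[i], w[i + 1]) in independent:
                v = w[:i] + w[i + 1] + w[i] + w[i + 2:]
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return seen
```

With `I = symmetric({("a", "b")})` over the letters `a, b, c`, the word `ab` is not connected, `abc` is connected, and the trace of `abc` consists of the two words `abc` and `bac`.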

Theorem 3 (Ochmański, [12,4,6]). Let $(\Sigma, I)$ be any trace alphabet and $M$
its trace monoid. Then a language $L \subseteq M$ is recognizable iff it is c-rational.

Again, one should note that Kleene's theorem mentioned above is a special
case of Theorem 3 since when the independence relation is empty, the trace
monoid $M(\Sigma, \emptyset)$ is the free monoid $\Sigma^*$ and in this case all languages are
connected, hence rational sets are also c-rational.

The goal of this paper is a common generalization of Theorems 2 and 3,
that is, a characterization of the recognizable formal power series in $K\langle\langle M\rangle\rangle$
where $K$ is a semiring and $M$ a trace monoid. Let $S \in K\langle\langle M\rangle\rangle$. We say that $S$
is connected, if $\mathrm{supp}(S)$ is a connected language in $M$, and mono-alphabetic, if
$\mathrm{supp}(S)$ is mono-alphabetic. In the latter case, we put $\alpha(S) = \alpha(m)$ if $S \neq 0$
and $m \in \mathrm{supp}(S)$. Now let $K^{\mathrm{mc\text{-}rat}}\langle\langle M\rangle\rangle$ (mono-alphabetic-connected rational)
be the smallest subset of $K\langle\langle M\rangle\rangle$ which contains all polynomials and is closed
under the operations sum, product and star, where the latter gets applied only
to proper, mono-alphabetic and connected series. Similarly, we let $K^{\mathrm{c\text{-}rat}}\langle\langle M\rangle\rangle$
(connected rational) be the collection of series obtained from the polynomials
by allowing the operations sum, product and star, where now star is applied to
all proper and connected series. Similarly, we define connected series in $K\langle\langle \Sigma^*\rangle\rangle$
and the collection of mc-rational series in $K\langle\langle \Sigma^*\rangle\rangle$.


3  Mc-rational  series  are  recognizable 

In this section, let $(\Sigma, I)$ be a trace alphabet and $M = M(\Sigma, I)$ its trace monoid.
We will prove Theorem 1(b,c). This will require a more particular notion of
representations which we introduce first.
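Definition 4. Let $S = (\lambda, \mu, \gamma) \in K\langle\langle M\rangle\rangle$ be a recognizable series with
$\mu : M \to K^{n \times n}$. The representation $(\lambda, \mu, \gamma)$ is alphabetic, if there exist two
functions $\overleftarrow{\alpha}, \overrightarrow{\alpha} : [n] \to \mathcal{P}(\Sigma)$ such that for all $u \in M$, the following three conditions
are satisfied:

(1) Whenever $\mu u_{ij} \neq 0$, then $\overleftarrow{\alpha}(j) = \overleftarrow{\alpha}(i) \cup \alpha(u)$ and $\overrightarrow{\alpha}(i) = \overrightarrow{\alpha}(j) \cup \alpha(u)$;

(2) whenever $\lambda_i \neq 0$, then $\overleftarrow{\alpha}(i) = \emptyset$;

(3) whenever $\gamma_j \neq 0$, then $\overrightarrow{\alpha}(j) = \emptyset$.

We call $(\lambda, \mu, \gamma; \overleftarrow{\alpha}, \overrightarrow{\alpha})$ an alphabetic representation of $S$. Here, $\overleftarrow{\alpha}(k)$ describes
the past alphabet of $k$ and $\overrightarrow{\alpha}(k)$ the future alphabet of $k$. We say that $k$ is initial,
if $\overleftarrow{\alpha}(k) = \emptyset$, and $k$ is final, if $\overrightarrow{\alpha}(k) = \emptyset$.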

We will often use the fact that if $(\lambda, \mu, \gamma)$ is alphabetic and $\mu u_{ij} \neq 0$, then $i$
initial implies that $\overleftarrow{\alpha}(j) = \alpha(u)$, and $j$ final implies $\overrightarrow{\alpha}(i) = \alpha(u)$. Moreover, if
$u \neq 1$, then $i$ initial implies $\mu u_{ki} = 0$, and $j$ final implies $\mu u_{jk} = 0$, for any $k$.

Proposition 5. Let $S \in K\langle\langle M\rangle\rangle$ be a recognizable series. Then there exists an
alphabetic representation of $S$.

First we want to show that the product of two recognizable series in $K\langle\langle M\rangle\rangle$
is again recognizable. For more particular semirings $K$ (strong Fatou semirings
or the Boolean semiring), the result has been obtained already by Fliess [7,
Prop. 2.2.14 and 2.2.15]. Our proof will not use the full notion of alphabetic
representation, since it can be based either on the past alphabets (the function
$\overleftarrow{\alpha}$) or the future alphabets, only. The full notion of alphabetic representation
will come into use when we deal with iteration.

Theorem 6. Let $K$ be a commutative semiring and let $S_1, S_2 \in K\langle\langle M\rangle\rangle$ be two
recognizable series. Then their product $S = S_1 \cdot S_2$ is also recognizable.

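Proof. Let $(\lambda^1, \mu_1, \gamma^1)$ be a representation of $S_1$ and let $(\lambda^2, \mu_2, \gamma^2; \overleftarrow{\alpha}, \overrightarrow{\alpha})$ be an
alphabetic representation of $S_2$ (Proposition 5). We assume that $\mu_i : M \to K^{n_i \times n_i}$
for $i = 1, 2$, and let $n = n_1 \cdot n_2$. Subsequently we identify $[n]$ with
$[n_1] \times [n_2]$. Next, we define $\mu : \Sigma^* \to K^{n \times n}$ on letters $a \in \Sigma$ by

$$\mu(a)_{(i_1, i_2), (j_1, j_2)} = \mu_1(a)_{i_1 j_1} \cdot \delta_{i_2 j_2} \cdot I(a, j_2) + \delta_{i_1 j_1} \cdot \mu_2(a)_{i_2 j_2}$$

where $\delta_{ij} = 1$ if $i = j$ and $0$ otherwise, and $I(u, i) = 1$ if $u \mathrel{I} \overleftarrow{\alpha}(i)$ and $0$ otherwise.

Note that $I(a, j_2) \cdot \mu_2(a)_{i_2 j_2} = 0$, hence at most one of the two terms is non-zero.

One can prove that $\mu(a) \cdot \mu(b) = \mu(b) \cdot \mu(a)$ for all $(a, b) \in I$. Hence, $\mu$
factorizes to a morphism $\mu : M \to K^{n \times n}$. Next we claim that this factorization
is given by the explicit formula

$$\mu(w)_{(i_1, i_2), (j_1, j_2)} = \sum_{w = uv,\ u \mathrel{I} \overleftarrow{\alpha}(i_2)} \mu_1(u)_{i_1 j_1} \cdot \mu_2(v)_{i_2 j_2}.$$

Finally, define $\lambda \in K^{1 \times n}$ by $\lambda_{(i_1, i_2)} = \lambda^1_{i_1} \cdot \lambda^2_{i_2}$ and $\gamma \in K^{n \times 1}$ by
$\gamma_{(j_1, j_2)} = \gamma^1_{j_1} \cdot \gamma^2_{j_2}$. One can verify that $S = (\lambda, \mu, \gamma)$, which proves the theorem.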

The following result shows that a mono-alphabetic recognizable series has
an alphabetic representation $(\lambda, \mu, \gamma; \overleftarrow{\alpha}, \overrightarrow{\alpha})$ with an even more specific form. For
this, let $e_1 = (1, 0, \ldots, 0) \in K^{1 \times n}$ and $e_n = (0, \ldots, 0, 1)^t \in K^{n \times 1}$.


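Proposition 7. Let $S \in K\langle\langle M\rangle\rangle$ be recognizable, proper and mono-alphabetic
with $\alpha(S) = A$. Then there exists an alphabetic representation $(e_1, \mu, e_n; \overleftarrow{\alpha}, \overrightarrow{\alpha})$
of $S$ with $\overrightarrow{\alpha}(1) = \overleftarrow{\alpha}(n) = A$.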

We  will  now  prove  the  following  essential  closure  property  of  recognizable 
series.  Note  that  Theorem  1(b)  follows  easily  from  Theorems  6  and  8. 

Theorem 8. Let $K$ be a commutative semiring and let $S \in K\langle\langle M\rangle\rangle$ be a proper,
connected, mono-alphabetic and recognizable series. Then $S^*$ is recognizable.


The proof of this theorem is based on a rather involved construction. Let
$S \in K\langle\langle M\rangle\rangle$ be a proper, recognizable, connected and mono-alphabetic series
with $\alpha(S) = A$. Let $S = (e_1, \mu, e_n; \overleftarrow{\alpha}, \overrightarrow{\alpha})$ be an alphabetic representation with
$\overrightarrow{\alpha}(1) = \overleftarrow{\alpha}(n) = A$ (Proposition 7). Let $m \geq 1$. We identify $[n^m]$ with the set $[n]^m$
of all $m$-tuples with entries from $[n]$. We use $\bar{\imath}$ as abbreviation for such an $m$-tuple
$(i_1, \ldots, i_m)$, similarly $\bar{\jmath}$, $\bar{k}$. Now we define functions $\mu^0, \ldots, \mu^m : \Sigma^* \to K^{n^m \times n^m}$

by


/  y  ar,j  = 


if  (?:2,...  ,?'m,l) 
otherwise 


fAarj 


10 


if  ji  =  ii  for  all  /  ^ 

otherwise  ~ 


Also,  let 


{1  if  ^(yp)  U  o.(ip)  =  A  =  a{S)  for  all  p,  ^(z'l)  ^  0  and 
a(?‘p)  I  n'(yg)  for  all  p  <  <7 
0  otherwise 


Let  H  G  A’”’"  be  given  by  Hij  =  and  define  fT  :  S*  — >  A  ” 

by  f.i*  =  H  0  (/y°  +  •  ■  ■  +  //""),  where  {H  Q  =  Hjj  •  pTwij  for  any  w  G  N* 

and  7,  j  G  [ri]  ^ . 

Theorem  8  results  clearly  from  the  following  two  essential  results. 


Proposition 9. Let $K$ be a commutative semiring and assume that $m > |A|$.
Then $\mu^*(ab) = \mu^*(ba)$ for all $a, b \in \Sigma$ such that $a \mathrel{I} b$.

Hence $\mu^*$ factorizes to a morphism from $M$ to $K^{n^m \times n^m}$ and we have:
Proposition 10. Let $K$ be a commutative semiring and assume that $m > |A|$.
Then $S^* = (\lambda_{\bar{1}}, \mu^*, \gamma_{\bar{1}})$ where $\lambda_{\bar{1}}, \gamma_{\bar{1}}$ are the row respectively column vectors which
have a 1 only at entry $\bar{1} = (1, \ldots, 1)$, and 0 otherwise.

Next we wish to derive further closure properties of $K^{\mathrm{rec}}\langle\langle M\rangle\rangle$.


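Definition 11. Let $S \in K\langle\langle M\rangle\rangle$ or $S \in K\langle\langle \Sigma^*\rangle\rangle$ and $A \subseteq \Sigma$. Then the restriction
of $S$ to $A$ is the series $S_A$ defined by

$$(S_A, w) = \begin{cases} (S, w) & \text{if } \alpha(w) = A \\ 0 & \text{otherwise.} \end{cases}$$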

First we show that the restriction preserves both recognizability and
mc-rationality of series.

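Proposition 12. Let $S \in K\langle\langle M\rangle\rangle$ be recognizable. Then $S_A$ is recognizable.

Proposition 13. Let $S \in K\langle\langle \Sigma^*\rangle\rangle$ or $S \in K\langle\langle M\rangle\rangle$ be mc-rational. Then $S_A$ is
also mc-rational.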


The  following  lemma  generalizes  a  result  of  Pighizzini  [13]  for  trace  languages. 

Lemma 14. Let $S \in K\langle\langle M\rangle\rangle$ be proper and $A \subseteq \Sigma$ be nonempty. Then $(S^*)_A =
Z^+ X$ where $X = \sum_{B \subsetneq A} (S^*)_B$ and $Z = (X \cdot S)_A$.

Next  we  derive  another  sufficient  condition  which  implies  that  the  star  of  a 
recognizable  series  is  again  recognizable  and,  also,  that  the  star  of  an  mc-rational 
series  is  again  mc-rational. 


Theorem 15.

1. Let $K$ be a commutative semiring and $S \in K\langle\langle M\rangle\rangle$ be proper and recognizable
such that $S^*$ is connected. Then $S^*$ is recognizable.

2. Let $K$ be any semiring and $S \in K\langle\langle \Sigma^*\rangle\rangle$ or $S \in K\langle\langle M\rangle\rangle$ be proper and
mc-rational such that $S^*$ is connected. Then $S^*$ is mc-rational.

For positive semirings, the condition $S^*$ connected is stronger than $S$ connected.
This latter condition is actually sufficient to obtain the closure properties
stated in Theorem 15 when the semiring is commutative and idempotent. This is
an easy consequence of Theorem 1(a) and of Theorem 17 for which the following
lemma is crucial.

Lemma 16. Let $K$ be a commutative and idempotent semiring. Let $S \in K\langle\langle M\rangle\rangle$
be a connected series and let $B, C \subseteq \Sigma$ be independent subsets of the alphabet.
Then $(S^*)_{B \cup C} = (S^*)_B \cdot (S^*)_C$.

Theorem 17. Let $K$ be a commutative and idempotent semiring. A series in
$K\langle\langle M\rangle\rangle$ is mc-rational iff it is c-rational.
Proof. One direction is clear and for the converse, it suffices to show that the
star of an mc-rational connected series $S$ is still mc-rational. We will first show
by induction on the size of $A \subseteq \Sigma$ that if $S$ is an mc-rational connected series
then $(S^*)_A$ is mc-rational. The theorem follows directly since $S^* = \sum_{A \subseteq \Sigma} (S^*)_A$.

Clearly, $(S^*)_\emptyset = 1$ is mc-rational. Now, assume $A \neq \emptyset$ and let $A_1, \ldots, A_n$
be the connected components of $A$: $A = A_1 \cup \cdots \cup A_n$ and $A_i \mathrel{I} A_j$ for $i \neq j$.
By Lemma 16, we obtain $(S^*)_A = (S^*)_{A_1} \cdots (S^*)_{A_n}$ and we are reduced to
the case $A$ connected. Now, using Lemma 14 we obtain $(S^*)_A = Z^+ X$ where
$Z = (X \cdot S)_A$ and $X = \sum_{B \subsetneq A} (S^*)_B$. Then $X$ is mc-rational by induction
hypothesis. By Proposition 13, it follows that $Z$ is also mc-rational. Since we
have assumed $A$ connected, we deduce that $(S^*)_A = Z \cdot Z^* \cdot X$ is mc-rational.

Note  that  Theorem  1(c)  follows  from  Theorem  1(b)  and  Theorem  17. 

4  Recognizable  series  are  mc-rational 

Throughout this section, let $K$ be an arbitrary (possibly non-commutative) semiring
and $(\Sigma, I)$ a trace alphabet. We will prove that all recognizable series in
$K\langle\langle M\rangle\rangle$ are mc-rational. This uses the concept of lexicographic normal forms of
traces and LNF-representations of series which we introduce first. For this, fix
any linear order $<$ on $\Sigma$. We extend this to the lexicographic linear order, also
denoted by $<$, on $\Sigma^*$. We say that a word $w$ is the lexicographic normal form
of $[w]$, if it is the smallest element of $[w]$ with respect to $<$. Then LNF is the
set of all words which are lexicographic normal forms. Note that LNF is closed
under prefixes (and suffixes). Now let $\mathcal{A}_{\mathrm{LNF}} = (Q, \Sigma, \delta, q_0, Q)$ be the minimal
(reduced) automaton for LNF.
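
A brute-force sketch of lexicographic normal forms may help here. It assumes a small hypothetical trace alphabet over the letters `a < b < c` in which only `a` and `b` are independent, enumerates each trace by swapping independent neighbours, and takes the minimum. The names `INDEPENDENT`, `trace_class`, `lnf` and `in_lnf` are ours, and this is not the automaton $\mathcal{A}_{\mathrm{LNF}}$ of the text.

```python
from itertools import product

# Assumed independence relation: only a and b commute.
INDEPENDENT = {("a", "b"), ("b", "a")}


def trace_class(word, independent=INDEPENDENT):
    """All words equivalent to `word` (swap adjacent independent letters)."""
    seen, stack = {word}, [word]
    while stack:
        w = stack.pop()
        for i in range(len(w) - 1):
            if (w[i], w[i + 1]) in independent:
                v = w[:i] + w[i + 1] + w[i] + w[i + 2:]
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
    return seen


def lnf(word, independent=INDEPENDENT):
    """Lexicographic normal form: the smallest word of the trace [word]."""
    return min(trace_class(word, independent))


def in_lnf(word, independent=INDEPENDENT):
    return lnf(word, independent) == word


# All words of length <= 3 and those among them that are in normal form.
words = ["".join(p) for n in range(4) for p in product("abc", repeat=n)]
lnf_words = [w for w in words if in_lnf(w)]
```

For this particular relation the normal forms are exactly the words without the factor `ba`, and one can check on `lnf_words` that removing the first or the last letter of a normal form yields a normal form again.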

Definition 18. We will call a morphism $\mu : \Sigma^* \to K^{n \times n}$ an LNF-morphism,
if there exists a function $\pi : [n] \to Q$ such that for all $a \in \Sigma$ and all $i, j \in [n]$,
$\mu a_{ij} \neq 0$ implies $\pi(i) \xrightarrow{a} \pi(j)$ in $\mathcal{A}_{\mathrm{LNF}}$. Then any representation $(\lambda, \mu, \gamma)$ with
an LNF-morphism $\mu$ of a series $S \in K\langle\langle \Sigma^*\rangle\rangle$ will be called an LNF-representation
of $S$.

Proposition 19. Let $S \in K\langle\langle \Sigma^*\rangle\rangle$ be recognizable. Then $S' = S \odot \mathbb{1}_{\mathrm{LNF}}$ has an
LNF-representation.

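Next we note that for any $n \geq 1$ there is a canonical isomorphism $\Phi$ between
the semiring $(K\langle\langle \Sigma^*\rangle\rangle)^{n \times n}$ of $n \times n$-matrices and the semiring of formal power
series $K^{n \times n}\langle\langle \Sigma^*\rangle\rangle$, given by $(\Phi(A), w) = ((A_{ij}, w))$ if $A = (A_{ij}) \in (K\langle\langle \Sigma^*\rangle\rangle)^{n \times n}$.
Subsequently, we will often identify $A$ with its image $\Phi(A)$.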

We  will  also  use  the  following  result. 

Lemma 20 (Ochmański, [12,4]). Let $w \in \Sigma^*$ be a word such that $w, w^2 \in
\mathrm{LNF}$. Then $w$ is connected.

Proposition 21. Let $\mu : \Sigma^* \to K^{n \times n}$ be an LNF-morphism, and let
$M = \sum_{a \in \Sigma} \mu(a) \cdot a \in K^{n \times n}\langle \Sigma^*\rangle$. Then the entries of $M^*$ are mc-rational series.
Proof. We first show, by induction on the length of $w$, that $(M^*, w) = \mu w$ for
any word $w$. Indeed, clearly $(M^*, 1) = 1 = \mu 1$ and $(M^*, wa) = (1 + M^* M, wa) =
(M^* M, wa) = (M^*, w)(M, a) = \mu w \cdot \mu a = \mu(wa)$.

By lack of space we only give the proof for $n = 1$, which already shows several
connections between all the results. Hence, assume that $n = 1$. Then $M \in K\langle \Sigma^*\rangle$
is proper and mc-rational. Now, let $w \in \Sigma^*$. If $(M^*, w) = \mu w \neq 0$, since $\mu$ is an
LNF-morphism, we have a path $\pi(1) \xrightarrow{w} \pi(1)$ in $\mathcal{A}_{\mathrm{LNF}}$. Therefore, $w^2 \in \mathrm{LNF}$
and by Ochmański's Lemma 20, $w$ is connected. Hence $M^*$ is connected and so,
by Theorem 15, mc-rational.

Theorem 22. Let $S \in K\langle\langle \Sigma^*\rangle\rangle$ be recognizable. Then $S \odot \mathbb{1}_{\mathrm{LNF}}$ is mc-rational.

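Proof. By Proposition 19 we can choose an LNF-representation $(\lambda, \mu, \gamma)$ of $S' =
S \odot \mathbb{1}_{\mathrm{LNF}}$. Let $M = \sum_{a \in \Sigma} \mu(a) \cdot a$. We have seen in the proof of Proposition 21
that $(M^*, w) = \mu w$ for any word $w$.

Now, $\lambda$ and $\gamma$ are vectors with entries in $K$, and $M^*$ has only mc-rational
series as entries by Proposition 21. Hence $\lambda M^* \gamma \in K\langle\langle \Sigma^*\rangle\rangle$ is an mc-rational
series. Finally, observe that for any word $w$,

$$(\lambda M^* \gamma, w) = \sum_{i,j} \lambda_i\, (M^*_{ij}, w)\, \gamma_j = \sum_{i,j} \lambda_i\, \mu w_{ij}\, \gamma_j = \lambda\, \mu w\, \gamma = (S', w).$$

Therefore $S \odot \mathbb{1}_{\mathrm{LNF}} = S' = \lambda M^* \gamma$ is mc-rational.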

Corollary 23. Let $S \in K\langle\langle \Sigma^*\rangle\rangle$ be recognizable with $\mathrm{supp}(S) \subseteq \mathrm{LNF}$. Then $S$
is mc-rational.

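Let $M, N$ be two monoids and $h : M \to N$ be a morphism. Then $h^{-1} :
K\langle\langle N\rangle\rangle \to K\langle\langle M\rangle\rangle$ given by $(h^{-1}(S), w) = (S, h(w))$ ($w \in M$) is a semiring
morphism. Moreover, if $S = (\lambda, \mu, \gamma) \in K^{\mathrm{rec}}\langle\langle N\rangle\rangle$, then $(h^{-1}(S), w) = (S, h(w)) =
\lambda\, \mu h(w)\, \gamma$, hence (cf. [14, p.32])

$$h^{-1}(S) = (\lambda, \mu \circ h, \gamma) \in K^{\mathrm{rec}}\langle\langle M\rangle\rangle.$$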

Let $\varphi : \Sigma^* \to M$ be the canonical epimorphism. Then $\varphi$ extends naturally
to a mapping, denoted by $\hat{\varphi}$, from $K\langle\langle \Sigma^*\rangle\rangle$ to $K\langle\langle M\rangle\rangle$ given by

$$\hat{\varphi}(S) = \sum_{w \in \Sigma^*} (S, w)\, \varphi(w) = \sum_{t \in M} \Big( \sum_{w \in \varphi^{-1}(t)} (S, w) \Big)\, t.$$

As is well-known from general results (cf., e.g., [14, pp. 13,14]), $\hat{\varphi}$ is a semiring
morphism and if $S$ is proper, then $\hat{\varphi}(S^*) = \hat{\varphi}(S)^*$. Furthermore, if $S$ is connected
(respectively, mono-alphabetic), then $\hat{\varphi}(S)$ is also connected (respectively,
mono-alphabetic). From this, it is clear that if $S$ is mc-rational, then $\hat{\varphi}(S)$ is also
mc-rational. Now we prove Theorem 1(a).
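Theorem 24. Let $S \in K\langle\langle M\rangle\rangle$ be recognizable. Then $S$ is mc-rational.

Proof. Let $S = (\lambda, \mu, \gamma) \in K^{\mathrm{rec}}\langle\langle M\rangle\rangle$. As noted before, $\varphi^{-1}(S) \in K^{\mathrm{rec}}\langle\langle \Sigma^*\rangle\rangle$.
By Theorem 22, $\varphi^{-1}(S) \odot \mathbb{1}_{\mathrm{LNF}}$ is mc-rational. Hence also $\hat{\varphi}(\varphi^{-1}(S) \odot \mathbb{1}_{\mathrm{LNF}})$ is
mc-rational. Now for each $t \in M$ we have

$$(\hat{\varphi}(\varphi^{-1}(S) \odot \mathbb{1}_{\mathrm{LNF}}), t) = \sum_{w \in \varphi^{-1}(t)} (\varphi^{-1}(S) \odot \mathbb{1}_{\mathrm{LNF}}, w) = \sum_{w \in \varphi^{-1}(t) \cap \mathrm{LNF}} (\varphi^{-1}(S), w) = \sum_{w \in \varphi^{-1}(t) \cap \mathrm{LNF}} (S, \varphi(w)) = (S, t).$$

Therefore, $S = \hat{\varphi}(\varphi^{-1}(S) \odot \mathbb{1}_{\mathrm{LNF}})$ is mc-rational.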

5  Examples  and  consequences 

Here we will give two examples to show that the assumptions in Theorems 6 and 8
(hence, in Theorem 1(b,c)) are necessary. We also indicate the relationship with
the results of Schützenberger and Ochmański. First, we show that in Theorem 6
the commutativity of $K$ is necessary.

Example  25.  Consider  the  trace  alphabet  {E.,1)  with  E  =  I 

let  A'  =  l(N^).  Let  A  =  J].  ^  Then  5*  and  T  are 

recognizable. Indeed, if $\mu : \Sigma^* \to K$ is defined by $\mu(a) = a$ and $\mu(b) = 0$ and
$\lambda = \gamma = 1$, then $S = (\lambda, \mu, \gamma)$. However, we can show that $S \cdot T \in K\langle\langle M\rangle\rangle$ is not
recognizable.

Secondly, we want to show that in general $K^{\mathrm{rec}}\langle\langle M\rangle\rangle$ is properly contained in
$K^{\mathrm{c\text{-}rat}}\langle\langle M\rangle\rangle$. That is, we show that the star of a connected recognizable series
may not be recognizable. (Thus by Theorem 15, the star of this series will not
be connected.)

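Example 26. Again consider the trace alphabet $(\Sigma, I)$ with $\Sigma = \{a, b\}$ and $a \mathrel{I} b$,
and let $S = a + b \in \mathbb{N}\langle M\rangle$. Then, obviously, $S$ is a connected polynomial and
$(S^*, t) = \binom{|t|}{|t|_a}$ for all $t \in M$. Hence, $S^* = \sum_{n, m \in \mathbb{N}} \binom{n+m}{n}\, a^n b^m$. One can show
that $S^*$ is not recognizable.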
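
The coefficients in this example can be checked by brute force. Since $a \mathrel{I} b$, a trace over $\{a, b\}$ is determined by its numbers of `a`'s and `b`'s, and the coefficient of a trace in $S^*$ counts the words representing it. The following sketch, with the hypothetical helper `star_coefficients`, tallies these counts for short words so that they can be compared with binomial coefficients.

```python
from collections import Counter
from itertools import product
from math import comb


def star_coefficients(max_len):
    """Coefficient of each trace a^i b^j in (a + b)*, for i + j <= max_len.

    A trace is identified with the pair (i, j); every word over {a, b}
    contributes 1 to the trace it represents.
    """
    coeff = Counter()
    for n in range(max_len + 1):
        for word in product("ab", repeat=n):
            coeff[(word.count("a"), word.count("b"))] += 1
    return coeff


def matches_binomials(coeff):
    """Compare the tallied coefficients with the binomial formula."""
    return all(c == comb(i + j, i) for (i, j), c in coeff.items())
```

The unbounded growth of these coefficients along the diagonal $i = j$ is what the example exploits; the sketch only confirms the formula on small cases and says nothing about non-recognizability itself.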

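Let $\Sigma$ be any finite alphabet. If $I = \emptyset$, the trace monoid $M(\Sigma, I)$ is isomorphic
to $\Sigma^*$. Hence, by Theorem 24 we have $K^{\mathrm{rec}}\langle\langle \Sigma^*\rangle\rangle \subseteq K^{\mathrm{mc\text{-}rat}}\langle\langle \Sigma^*\rangle\rangle$.
Now, using one inclusion of Theorem 2, we obtain $K^{\mathrm{rec}}\langle\langle \Sigma^*\rangle\rangle = K^{\mathrm{mc\text{-}rat}}\langle\langle \Sigma^*\rangle\rangle =
K^{\mathrm{rat}}\langle\langle \Sigma^*\rangle\rangle$ which is in fact a strengthening of Theorem 2.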

Now  we  show  how  to  deduce  and  actually  strengthen  Theorem  3  from  our 
results.  The  following  can  be  proved  in  the  same  way  as  classically  for  the  free 
monoid  (cf.  [14,2]). 

Proposition 27. $L \subseteq M$ is recognizable (resp. rational, c-rational, mc-rational)
iff $\mathbb{1}_L \in \mathbb{B}\langle\langle M\rangle\rangle$ is recognizable (resp. rational, c-rational, mc-rational).
Since the Boolean semiring $\mathbb{B}$ is both commutative and idempotent, we deduce
from Theorem 1 that a series in $\mathbb{B}\langle\langle M\rangle\rangle$ is recognizable iff it is c-rational iff
it is mc-rational. Using Proposition 27, we deduce that a trace language $L \subseteq M$
is recognizable iff it is c-rational iff it is mc-rational. The first equivalence is
precisely Ochmański's theorem. The second one is a strengthening of a result
by Pighizzini [13] which characterizes the recognizable languages as those
languages obtained from finite sets of traces using union, concatenation, restriction
to subalphabet and star restricted to mono-alphabetic and connected languages.
Abstract. We solve a conjecture of J. Shallit related to the automaticity
function  of  a  unary  language,  or  equivalently  to  the  first  occurrence  function 
in  a  symbolic  sequence.  The  answer  is  negative:  the  conjecture  is  false,  but 
it  can  be  corrected  by  changing  the  constant  involved.  The  proof  is  based 
on  a  study  of  paths  in  the  Rauzy  graphs  associated  with  the  sequence. 


1  Introduction 


In a recent paper [6], Shallit proposed a conjecture on the automaticity function
of a unary language, i.e. the size of the minimum finite-state machine that
correctly decides membership in the language for words of length at most n.
See [9] for more details on the automaticity function and its applications; in
short, it measures how close the language is to being regular. The conjecture
arises from a natural question: apart from regular languages (which have
bounded automaticity), what is the lowest possible automaticity that a language
can have? Shallit rephrased his conjecture in combinatorial terms as follows:


Conjecture 1. Let $\mathbf{u} = u_1 u_2 u_3 \ldots$ be an infinite word over a finite alphabet
that is not ultimately periodic. Define $S(n)$ to be the length of the longest suffix
of $u_1 u_2 \ldots u_{n+1}$ that is also a factor of $u_1 u_2 \ldots u_n$. Then

$$\liminf_{n \to \infty} \frac{S(n)}{n} \le 2 - \varphi \approx .381966$$

where $\varphi = (1 + \sqrt{5})/2 \approx 1.61803$ is the golden ratio.

He also proved that if it is true, then this conjecture is optimal, as the value $2 - \varphi$
is attained for the famous Fibonacci word,

0100101001001010010100100101001001010010100100101001010 ...

which is the fixed point of the substitution $0 \mapsto 01$, $1 \mapsto 0$.
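
The definitions in Conjecture 1 can be explored numerically. The following is a minimal sketch, not part of the original paper: the function names `fibonacci_word` and `longest_repeated_suffix` are hypothetical, and the search for $S(n)$ is a plain brute-force scan. It builds a prefix of the Fibonacci word by iterating the substitution and evaluates $S(n)/n$ on a finite range, which can only suggest (not prove) the value of the lower limit.

```python
def fibonacci_word(length):
    """Return the first `length` letters of the fixed point of 0 -> 01, 1 -> 0."""
    rules = {"0": "01", "1": "0"}
    word = "0"
    # Each iterate is a prefix of the next one, so truncating at the end is safe.
    while len(word) < length:
        word = "".join(rules[letter] for letter in word)
    return word[:length]


def longest_repeated_suffix(u, n):
    """S(n): length of the longest suffix of u_1...u_{n+1} occurring in u_1...u_n."""
    head, extended = u[:n], u[:n + 1]
    for size in range(n, -1, -1):
        suffix = extended[n + 1 - size:]  # suffix of length `size`
        if suffix in head:
            return size
    return 0


if __name__ == "__main__":
    u = fibonacci_word(700)
    print(u[:55])
    ratios = [longest_repeated_suffix(u, n) / n for n in range(100, 600)]
    print("min of S(n)/n for 100 <= n < 600:", min(ratios))
```

The printed minimum can be compared with $2 - \varphi \approx .381966$; the chosen range of $n$ is arbitrary.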

Allouche and Bousquet-Mélou [1] noticed a similarity between this conjecture
and an older conjecture of Rauzy [7], also involving the golden ratio:


Conjecture 2. Let $\mathbf{u}$ be an infinite word over a finite alphabet that is not ultimately
periodic. Let $R(n)$ be the recurrence function of $\mathbf{u}$, i.e. the size of the
smallest window containing an occurrence of every factor of $\mathbf{u}$ of length $n$ whatever
its position on $\mathbf{u}$, or $\infty$ if no such window exists. Then

$$\limsup_{n \to \infty} \frac{R(n)}{n} \ge \varphi + 2 = \frac{5 + \sqrt{5}}{2} \approx 3.61803 .$$

They proposed a modified (“Rauzy-like”) conjecture, and proved that it was
equivalent to Shallit’s conjecture:


Conjecture 3. Let $\mathbf{u}$ be an infinite word over a finite alphabet that is not ultimately
periodic. Let $R'(n)$ be the length of the shortest prefix of $\mathbf{u}$ containing an
occurrence of every factor of $\mathbf{u}$ of length $n$. Then

$$\limsup_{n \to \infty} \frac{R'(n)}{n} \ge \varphi + 1 = \frac{3 + \sqrt{5}}{2} \approx 2.61803 .$$


Using Rauzy graphs, we were able to prove Conjecture 2 [3]. We then
tried to adapt the proof to Conjecture 3. In principle, Conjecture 3 should have
been easier to prove in this way than Conjecture 2, as the constant is smaller
and the number of different cases to study is therefore reduced. However, we did
not succeed in this attempt, and we resolved to first restrict to the case of Sturmian
words, which we had previously dismissed as trivial, following Allouche and
Bousquet-Mélou: “[...] the case of the Sturmian words [...] can certainly be addressed
by adapting the arguments of [5] for the computation of $\limsup R(n)/n$,
but we have not written the details.” We did not try to use the method of Morse
and Hedlund [5], which is specific to Sturmian words, but our general method
with (pointed) Rauzy graphs. It then appeared that, contrary to what we expected,
the Fibonacci word is not optimal for $R'(n)/n$. Indeed, the infinite word

$$\mathbf{z}_3 = 0100101001001001010010010100100100101001001 \ldots$$

defined as the fixed point of the substitution $0 \mapsto 01001010$, $1 \mapsto 010$ satisfies

$$\limsup_{n \to \infty} \frac{R'(n)}{n} = \frac{29 - 2\sqrt{10}}{9} \approx 2.51949 < \varphi + 1 .$$
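
The word $\mathbf{z}_3$ and the ratio $R'(n)/n$ can be examined on a finite prefix. The sketch below is illustrative only and is not taken from the paper: the helper names are hypothetical, and it assumes that $\mathbf{z}_3$ is Sturmian (as stated later in the text), so that it has exactly $n + 1$ factors of length $n$. Under that assumption, $R'(n)$ is the length of the shortest prefix in which $n + 1$ distinct factors of length $n$ have been seen.

```python
def fixed_point(rules, length, seed="0"):
    """First `length` letters of the fixed point of `rules` starting with `seed`."""
    word = seed
    # rules[seed] must begin with seed, so every iterate is a prefix of the next.
    while len(word) < length:
        word = "".join(rules[letter] for letter in word)
    return word[:length]


def shortest_covering_prefix(u, n, factor_count):
    """Length of the shortest prefix of u containing `factor_count` distinct
    factors of length n (this is R'(n) when factor_count equals p(n))."""
    seen = set()
    for start in range(len(u) - n + 1):
        seen.add(u[start:start + n])
        if len(seen) == factor_count:
            return start + n
    raise ValueError("the available prefix of u is too short")


Z3_RULES = {"0": "01001010", "1": "010"}

if __name__ == "__main__":
    z3 = fixed_point(Z3_RULES, 4000)
    # Assumption: z3 is Sturmian, hence p(n) = n + 1.
    ratios = [shortest_covering_prefix(z3, n, n + 1) / n for n in range(20, 400)]
    print("max of R'(n)/n for 20 <= n < 400:", max(ratios))
```

A maximum over a finite range is only an indication of the upper limit; it can be compared with the value $2.51949$ given above.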

Conjectures  1  and  3  are  therefore  false.  However,  we  are  now  able  to  prove  a 
modified  conjecture,  with  a  different  constant: 


Theorem 1. Let $\mathbf{u}$ be an infinite word over a finite alphabet that is not ultimately
periodic. Let $R'(n)$ be the length of the shortest prefix of $\mathbf{u}$ containing an
occurrence of every factor of $\mathbf{u}$ of length $n$. Then

$$\limsup_{n \to \infty} \frac{R'(n)}{n} \ge \frac{29 - 2\sqrt{10}}{9} \approx 2.51949$$

and this value is optimal.


Fortunately, Allouche and Bousquet-Mélou [1] proved much more than the
equivalence of Conjectures 1 and 3: they proved that the numbers $\liminf S(n)/n$
and $\limsup R'(n)/n$ are inverses of each other. Therefore, we immediately deduce
a modified version of Shallit’s conjecture, where the constant is optimal for the
same Sturmian word $\mathbf{z}_3$ as above:


Corollary 1. Let $\mathbf{u} = u_1 u_2 u_3 \ldots$ be an infinite word over a finite alphabet that
is not ultimately periodic. Define $S(n)$ to be the length of the longest suffix of
$u_1 u_2 \ldots u_{n+1}$ that is also a factor of $u_1 u_2 \ldots u_n$. Then

$$\liminf_{n \to \infty} \frac{S(n)}{n} \le \frac{29 + 2\sqrt{10}}{89} \approx .396905$$

and this value is optimal.

In Section 2, we define precisely the tools that we will use in the proof. We
then study in Section 3 the case of Sturmian words, and the word $\mathbf{z}_3$ occurs
naturally in this process. Finally, we explain in Section 4 how the general case
can be reduced to the Sturmian case.

2  Preliminaries 

2.1  Complexity  and  First  Occurrence  Functions 

Let $\Sigma$ be a finite alphabet, and $\Sigma^\omega$ the set of one-way infinite sequences over
$\Sigma$. If $\mathbf{u} = u_1 u_2 u_3 \ldots$ is an element of $\Sigma^\omega$, and $n$ is a non-negative integer,
we denote by $F_n(\mathbf{u})$ the set of factors (also called subwords) of length $n$ of
$\mathbf{u}$, i.e. of words of length $n$ consisting of letters occurring consecutively in $\mathbf{u}$:
$F_n(\mathbf{u}) = \{u_k u_{k+1} u_{k+2} \ldots u_{k+n-1} \mid k \ge 1\}$, and we denote by $F(\mathbf{u})$ the union of
these sets. We also denote by $\mathrm{pref}_n(\mathbf{u})$ the prefix of length $n$ of $\mathbf{u}$, i.e. the word
$u_1 u_2 \ldots u_n$.

The complexity function of $\mathbf{u}$ is then defined as the function mapping a
non-negative integer $n$ to the number of factors of length $n$ of $\mathbf{u}$: $p_{\mathbf{u}}(n) = \# F_n(\mathbf{u})$.
When there is no ambiguity on the sequence $\mathbf{u}$, we shall write $p(n)$ instead of
$p_{\mathbf{u}}(n)$. It is clear that for all $n \ge 0$, $1 \le p(n) \le (\#\Sigma)^n$; moreover, it is well-known
that $p(n) \ge n + 1$ when the sequence $\mathbf{u}$ is not ultimately periodic [5].
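
As an illustration of these definitions, the following sketch (hypothetical helper names, not from the paper) computes factor sets and the complexity function on a finite prefix. On a finite prefix one only sees the factors occurring in that prefix, so the values agree with $p(n)$ only when $n$ is small compared with the prefix length.

```python
def fibonacci_word(length):
    """First `length` letters of the fixed point of 0 -> 01, 1 -> 0."""
    rules = {"0": "01", "1": "0"}
    word = "0"
    while len(word) < length:
        word = "".join(rules[letter] for letter in word)
    return word[:length]


def factors(u, n):
    """Set of factors of length n occurring in the finite word u."""
    return {u[k:k + n] for k in range(len(u) - n + 1)}


def complexity(u, n):
    """Number of distinct factors of length n seen in u."""
    return len(factors(u, n))


if __name__ == "__main__":
    fib = fibonacci_word(500)
    periodic = "01" * 100
    for n in range(6):
        print(n, complexity(fib, n), complexity(periodic, n))
```

For the Fibonacci prefix the counts follow $n + 1$, while for the periodic word they stay bounded, in line with the dichotomy recalled above.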

To study Shallit’s conjecture, we will use the first occurrence function $\ell_{\mathbf{u}}$ (or
simply $\ell$) defined as follows. For any word $w \in F(\mathbf{u})$, let $\ell(w)$ be the smallest
positive integer $m$ such that $w = u_m u_{m+1} \ldots u_{m+|w|-1}$, so that for instance
$\ell(\mathrm{pref}_n(\mathbf{u})) = 1$, and let $\ell(n) = \max\{\ell(w) \mid w \in F_n(\mathbf{u})\}$.

Proposition 1. The function $R'$ defined in Conjecture 3 satisfies the relation
$R'(n) = \ell(n) + n - 1$.

Proof. The function $R'(n)$ is defined as the length of the shortest prefix of $\mathbf{u}$
containing every factor of length $n$ of $\mathbf{u}$. A factor $w \in F_n(\mathbf{u})$ occurs in $\mathrm{pref}_m(\mathbf{u})$
if and only if $\ell(w) \le m - (n - 1)$, therefore $\mathrm{pref}_m(\mathbf{u})$ contains all factors if and
only if $m \ge \ell(n) + n - 1$. □
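
The relation of Proposition 1 can be checked mechanically on a finite word, treating that word as the whole sequence. This is a sketch with hypothetical function names; `r_prime` and `ell` are computed independently of each other so that the comparison is meaningful.

```python
def factors(u, n):
    """Set of factors of length n occurring in the finite word u."""
    return {u[k:k + n] for k in range(len(u) - n + 1)}


def first_occurrence(u, w):
    """1-based position of the first occurrence of w in u."""
    return u.index(w) + 1


def ell(u, n):
    """Largest first-occurrence position among the factors of length n."""
    return max(first_occurrence(u, w) for w in factors(u, n))


def r_prime(u, n):
    """Length of the shortest prefix of u containing every factor of length n."""
    target = factors(u, n)
    seen = set()
    for m in range(n, len(u) + 1):
        seen.add(u[m - n:m])  # the factor that ends at position m
        if seen == target:
            return m


if __name__ == "__main__":
    u = "0100101001001010010100100101001001010010100100101001010"
    for n in range(1, 9):
        print(n, ell(u, n), r_prime(u, n))
```

In each printed row the third value equals the second plus $n - 1$.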

Defining $\lambda(\mathbf{u}) = \limsup \ell(n)/n$, we get as a corollary that

$$\limsup_{n \to \infty} \frac{R'(n)}{n} = \lambda(\mathbf{u}) + 1 .$$

Proposition 2. The first occurrence and complexity functions satisfy the
inequality $\ell(n) \ge p(n)$.

Proof. For two distinct factors $v$ and $w$ of the same length $n$, $\ell(v)$ and $\ell(w)$ are
two distinct positive integers. The set $\{\ell(w) \mid w \in F_n(\mathbf{u})\}$ therefore contains $p(n)$
distinct positive integers, hence its maximum $\ell(n)$ is at least $p(n)$. □

If the sequence $\mathbf{u}$ is ultimately periodic, then it is easy to see that the function
$\ell$ has a finite limit (it is the minimum value of $|uv|$, where $u$ and $v$ are words
such that $\mathbf{u} = u v^\omega$), hence $\lambda(\mathbf{u}) = 0$. Otherwise, the complexity is at least $n + 1$
[5], therefore $\ell(n) \ge n + 1$ by Proposition 2, and $\lambda(\mathbf{u}) \ge 1$. Theorem 1 says that
in fact $\lambda(\mathbf{u}) \ge (20 - 2\sqrt{10})/9 \approx 1.51949$.

2.2  Rauzy Graphs

To study the structure of the factors of a sequence $\mathbf{u}$, it is usually convenient to
define a sequence of graphs $G_n$, called Rauzy graphs or factor graphs, as follows.
For any non-negative integer $n$, let $G_n$ be the directed graph with $p(n)$ vertices
labelled with elements of $F_n(\mathbf{u})$, and with an edge from $u$ to $v$ if and only if
there exist two letters $x, y \in \Sigma$ such that $uy = xv \in F_{n+1}(\mathbf{u})$. The graph $G_n$ has
therefore $p(n + 1)$ edges.
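
A Rauzy graph is straightforward to build from a long enough prefix. In the sketch below (hypothetical names, not from the paper), each edge is identified by the factor of length $n + 1$ that induces it, which also keeps the two loops of $G_0$ distinct. As before, a finite prefix only shows the factors it contains.

```python
def rauzy_graph(u, n):
    """Return (vertices, edges) of the Rauzy graph of order n seen in u.

    `vertices` is the set of factors of length n; `edges` maps each factor w
    of length n + 1 to the pair (source, target) = (w without its last
    letter, w without its first letter).
    """
    vertices = {u[k:k + n] for k in range(len(u) - n + 1)}
    edges = {}
    for k in range(len(u) - n):
        w = u[k:k + n + 1]
        edges[w] = (w[:-1], w[1:])
    return vertices, edges


if __name__ == "__main__":
    fib = "0100101001001010010100100101001001010010100100101001010"
    for n in range(5):
        vertices, edges = rauzy_graph(fib, n)
        print(n, sorted(vertices), sorted(edges))
```

For this Fibonacci prefix the graph of order $n$ has $n + 1$ vertices and $n + 2$ edges for the small orders printed here.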

Unlike other problems for which only $F(\mathbf{u})$ is important, for Shallit’s conjecture
we need to know which factors occur first in the sequence. We shall add
this information to the Rauzy graphs by singling out one vertex, the one labelled
with the prefix of length $n$ of $\mathbf{u}$. We will therefore consider the pointed Rauzy
graph $(G_n, \mathrm{pref}_n(\mathbf{u}))$.

We choose to label edges of $G_n$ with letters, in the following way: if $uy = xv$
with $x, y \in \Sigma$, then the edge $(u, v)$ is labelled with the letter $x$. We then define
the label of a finite path of length $k$ in $G_n$ as the word of length $k$ obtained by
concatenating the labels of the edges in the order they are met, and similarly
the label of an infinite path as an infinite word.

With this definition, there is a unique infinite path in $G_n$ labelled with $\mathbf{u}$ and
starting in $\mathrm{pref}_n(\mathbf{u})$: it is the path $(w_1, w_2, w_3, \ldots)$ where $w_k$ is the $k$-th block
of length $n$ of $\mathbf{u}$, i.e. $w_k = u_k u_{k+1} \ldots u_{k+n-1}$ (in particular, $w_1 = \mathrm{pref}_n(\mathbf{u})$).
Knowing this path, we can now read $\ell(n)$ on the graph.

Proposition 3. Let $(w_1, w_2, \ldots)$ be the path labelled with $\mathbf{u}$ in $G_n$. Then
$\ell(n) - 1$ is the length of the shortest prefix of this path that goes through every
vertex of $G_n$, and $\ell(n + 1)$ the length of the shortest prefix of this path that
goes through every edge of $G_n$.

Proof. For a given $w \in F_n(\mathbf{u})$, we have $\ell(w) = \min\{k \ge 1 \mid w_k = w\}$. Consequently,
a prefix $(w_1, w_2, \ldots, w_k)$ of length $k - 1$ of the path labelled with $\mathbf{u}$
goes through the vertex $w$ if and only if $k \ge \ell(w)$, and it goes through every
vertex if and only if $k \ge \ell(n)$. Similarly, for a given edge $(u, v)$ labelled with
$x \in \Sigma$, we have $\ell(xv) = \min\{k \ge 1 \mid w_k = u \text{ and } w_{k+1} = v\}$. Consequently, a
prefix $(w_1, w_2, \ldots, w_{k+1})$ of length $k$ of the path labelled with $\mathbf{u}$ goes through
the edge $(u, v)$ if and only if $k \ge \ell(xv)$, and it goes through every edge if and
only if $k \ge \ell(n + 1)$. □
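
Proposition 3 can be checked by walking along the path of a finite prefix. This is a sketch with hypothetical names: vertices of the path are the successive blocks of length $n$, and each edge is identified by the block of length $n + 1$ that induces it.

```python
def ell(u, n):
    """Largest first-occurrence position (1-based) among factors of length n."""
    blocks = {u[k:k + n] for k in range(len(u) - n + 1)}
    return max(u.index(w) + 1 for w in blocks)


def steps_to_cover(items):
    """Smallest k such that the first k items already contain every item."""
    target, seen = set(items), set()
    for k, item in enumerate(items, start=1):
        seen.add(item)
        if seen == target:
            return k


def path_cover_lengths(u, n):
    """Lengths (in edges) of the shortest path prefixes covering all vertices
    and all edges of the Rauzy graph of order n read on the finite word u."""
    vertices_on_path = [u[k:k + n] for k in range(len(u) - n + 1)]
    edges_on_path = [u[k:k + n + 1] for k in range(len(u) - n)]
    # A path through k vertices uses k - 1 edges.
    return steps_to_cover(vertices_on_path) - 1, steps_to_cover(edges_on_path)


if __name__ == "__main__":
    u = "0100101001001010010100100101001001010010100100101001010"
    for n in range(1, 9):
        print(n, path_cover_lengths(u, n), (ell(u, n) - 1, ell(u, n + 1)))
```

The two pairs printed on each line coincide.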

It should be noted that in the graph $G_n$, every vertex has outdegree at least
one (i.e. has at least one outgoing edge), and every vertex except possibly the
one labelled with $\mathrm{pref}_n(\mathbf{u})$ has indegree at least one. The sequence is said to be
recurrent if every factor occurs infinitely often; in this case $\mathrm{pref}_n(\mathbf{u})$ also has
indegree at least one. If $\mathbf{u}$ is not recurrent, then for $n$ large enough the prefix
$\mathrm{pref}_n(\mathbf{u})$ occurs only once, and therefore the corresponding vertex in $G_n$ has
indegree zero.

A vertex $v$ of $G_n$ is called bispecial if both its indegree and its outdegree are
greater than 1 (the word $v$ is then a bispecial factor of $\mathbf{u}$ [4]).

2.3  From $G_n$ to $G_{n+1}$

The reader is warmly encouraged to construct the Rauzy graphs $G_n$, for small
$n$, for a simple sequence like the Fibonacci word, to get acquainted with the
manipulation of these graphs. One crucial point, on which the rest of this article
relies heavily, is the relation between $G_n$ and $G_{n+1}$, which is explained in detail
in [2, 4, 8] and summarized below.

Knowing $G_n$, one constructs its line graph $D(G_n)$ as follows: for every edge
$(u, v)$ labelled with $x$ in $G_n$, there is a vertex in $D(G_n)$ labelled with $xv$, and for
every pair of consecutive edges $((u, v), (v, w))$ in $G_n$ labelled with $x$ and $y$, there
is an edge $(xv, yw)$ in $D(G_n)$ labelled with $x$.

Proposition 4. The Rauzy graph of order $n + 1$ is a subgraph of $D(G_n)$. Namely:

- If $G_n$ has no bispecial vertex, then $G_{n+1} = D(G_n)$.

- If $G_n$ has bispecial vertices, then some (possibly none) edges $(xv, vy)$, with $v$
bispecial, have to be removed from $D(G_n)$ to obtain $G_{n+1}$.
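
The line-graph construction and Proposition 4 can be illustrated on the Fibonacci word. In this sketch (hypothetical names), edges of $G_n$ are encoded as factors of length $n + 1$, and each edge of $D(G_n)$ is encoded as the word of length $n + 2$ spelled by two consecutive edges, so that it can be compared directly with the edges of $G_{n+1}$.

```python
from collections import Counter


def fibonacci_word(length):
    """First `length` letters of the fixed point of 0 -> 01, 1 -> 0."""
    rules = {"0": "01", "1": "0"}
    word = "0"
    while len(word) < length:
        word = "".join(rules[letter] for letter in word)
    return word[:length]


def factors(u, n):
    return {u[k:k + n] for k in range(len(u) - n + 1)}


def line_graph_edges(edge_words):
    """Edges of D(G_n): pairs of consecutive edges e1, e2 of G_n, encoded as
    the word e1 followed by the last letter of e2."""
    return {e1 + e2[-1] for e1 in edge_words for e2 in edge_words
            if e1[1:] == e2[:-1]}


def bispecial_vertices(edge_words):
    """Vertices of G_n whose indegree and outdegree are both greater than 1."""
    indegree, outdegree = Counter(), Counter()
    for w in edge_words:
        outdegree[w[:-1]] += 1
        indegree[w[1:]] += 1
    return {v for v in outdegree if outdegree[v] > 1 and indegree[v] > 1}


if __name__ == "__main__":
    fib = fibonacci_word(300)
    for n in range(8):
        edges_n = factors(fib, n + 1)
        removed = line_graph_edges(edges_n) - factors(fib, n + 2)
        print(n, sorted(bispecial_vertices(edges_n)), sorted(removed))
```

The last column lists the edges of $D(G_n)$ that are absent from $G_{n+1}$; it is empty whenever the middle column (the bispecial vertices) is empty.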


3  The  Sturmian  Case 

In this section, we assume that $\mathbf{u}$ is a Sturmian sequence, i.e. a sequence with
complexity $p(n) = n + 1$. As $p(1) = 2$, the alphabet $\Sigma$ has only two letters. Rauzy
graphs of Sturmian sequences are described by the following proposition [2].

Proposition 5. If $\mathbf{u}$ is a Sturmian sequence, then the Rauzy graphs are of one
of the following two types (vertices with indegree and outdegree 1 are not represented).


Moreover, both types occur infinitely often.


We shall give a particular importance to graphs of the second type, which we
number $G_{n_0} = G_0$, $G_{n_1}$, $G_{n_2}$, etc. Adding the initial vertex $\mathrm{pref}_{n_k}(\mathbf{u})$ (marked
with a black triangle), we get the following pointed graph $G_{n_k}$:


(1)


The three branches are labelled with the words $a_k$, $b_k$, $c_k$ (in the case where
the initial vertex is also the bispecial one, $a_k$ is the empty word and the loop
labelled with $b_k$ is the first one used in a path labelled with $\mathbf{u}$). They satisfy
$p(n_k + 1) = n_k + 2 = |a_k b_k c_k|$.

We are now interested in the evolution of the graphs when $n$ grows from $n_k$
to $n_{k+1}$.

Proposition 6. For every $k$, the transition between $G_{n_k}$ and $G_{n_{k+1}}$ is of one of
the following three types:

transition A: $n_{k+1} = n_k + |a_k b_k|$, $a_{k+1} = a_k$, $b_{k+1} = b_k$, $c_{k+1} = a_k b_k c_k$;

transition B: $n_{k+1} = n_k + |a_k b_k|$, $a_{k+1} = a_k$, $b_{k+1} = b_k c_k$, $c_{k+1} = a_k b_k$;

transition C: $n_{k+1} = n_k + |c_k|$, $a_{k+1} = c_k a_k$, $b_{k+1} = b_k$, $c_{k+1} = c_k$.

Proof. We have to construct the graphs $G_n$ for $n_k < n \le n_{k+1}$, using Proposition 4
repeatedly. Let $w$ denote the bispecial factor of length $n_k$. Let $x$, $y$, $z$
respectively denote the last letters of $a_k$, $b_k$, $c_k$ (note that $y \ne z$). The line graph
$D(G_{n_k})$ is then


(2)


To obtain $G_{n_k+1}$, which is a graph of type (i), one of the two dotted edges
has to be removed from (2). (Note that the other two central edges cannot
be removed because the resulting graphs would only have ultimately periodic
paths.) Therefore, $G_{n_k+1}$ is either graph (3) or graph (4).


Then the next graph ($n = n_k + |b_k| + 1$) depends on which branch contains the
prefix. There are therefore two subcases,

which then evolves to $G_{n_{k+1}}$ at $n_{k+1} = n_k + |a_k b_k|$ (transition A):

(7)

and


(8)

which then evolves to $G_{n_{k+1}}$ also at $n_{k+1} = n_k + |a_k b_k|$ (transition B):


In the second case, the following graphs have the same morphology until
$n = n_k + |c_k|$, where we get directly $G_{n_{k+1}}$ (transition C):

(10)

We observe that there are three possible transitions, corresponding to the
three transformations A, B, and C.

Proposition 6 allows us to define a new representation of the sequence $\mathbf{u}$. Let
$\mathcal{A}$ be the alphabet $\mathcal{A} = \{A, B, C\}$; the adic representation of $\mathbf{u}$ is the sequence
$t = t_1 t_2 t_3 \ldots \in \mathcal{A}^\omega$, where $t_k$ indicates which kind of transition occurs between
$G_{n_{k-1}}$ and $G_{n_k}$. The adic representation is related to similar representations
studied in [10], and also (in the case of Sturmian words only) to the usual
continued fraction expansion of real numbers [5].

Proposition 7. Given a sequence $t \in \mathcal{A}^\omega \setminus \mathcal{A}^*(A^\omega \cup C^\omega)$, there exists a unique
Sturmian sequence $\mathbf{u}$ (up to renaming of the letters) such that $t$ is the adic
representation of $\mathbf{u}$.

Proof. We first take the graph $G_{n_0} = G_0$ to be the graph with one vertex and
two loops of length 1, labelled with the two letters of $\Sigma$ ($b_0$ will be the first
letter of $\mathbf{u}$, and $c_0$ the other letter), and $a_0 = \varepsilon$ since the starting vertex is the
bispecial one. Then $a_k$, $b_k$, and $c_k$ are entirely defined by the sequence $t$, using
the recurrence relations of Proposition 6. We thus know the labels of the edges
of the graphs $G_{n_k}$, and from this information we can also find the labels of the
vertices (the label of a vertex in $G_n$ is the label of any path of length $n$ starting
at this vertex). In particular we obtain the words $\mathrm{pref}_{n_k}(\mathbf{u})$, which as a limit
give a sequence $\mathbf{u}$.

We still have to check that $\mathbf{u}$ is indeed a Sturmian sequence. Its Rauzy graph
of order $n_k$ is a subgraph of $G_{n_k}$, but it may not be exactly $G_{n_k}$ in the event
where the path associated with $\mathbf{u}$ never reaches certain branches of the graph.
In this case the sequence would be ultimately periodic, which implies that for $n$
large enough, all graphs $G_n$ have a loop of the same size (equal to the period).
This occurs only when $t \in \mathcal{A}^* A^\omega$ or $t \in \mathcal{A}^* C^\omega$ (words in $\mathcal{A}^* B^\omega$ define legal
Sturmian sequences, for instance $B^\omega$ is the adic representation of the Fibonacci
sequence). □
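
The way the branch labels follow from $t$ can be sketched in a few lines. The transition rules coded below are a reconstruction and should be read as an assumption: A gives $(a, b, abc)$, B gives $(a, bc, ab)$, C gives $(ca, b, c)$, with $n$ increasing by $|ab|$ for A and B and by $|c|$ for C. They were inferred from the table in the proof of Proposition 8 and from the remark that $B^\omega$ represents the Fibonacci sequence. The function names are hypothetical.

```python
def step(transition, a, b, c):
    """Return (increase of n, a', b', c') for one transition.

    Assumption: these rules are a reconstruction of Proposition 6.
    """
    if transition == "A":
        return len(a) + len(b), a, b, a + b + c
    if transition == "B":
        return len(a) + len(b), a, b + c, a + b
    if transition == "C":
        return len(c), c + a, b, c
    raise ValueError("unknown transition: " + transition)


def branches(t, a="", b="0", c="1", n=0):
    """Apply the transitions of the finite word t, starting from G_0 by default."""
    for transition in t:
        delta, a, b, c = step(transition, a, b, c)
        n += delta
    return n, a, b, c


if __name__ == "__main__":
    for k in range(6):
        n, a, b, c = branches("B" * k)
        # Starting from G_0, the relation n_k + 2 = |a_k b_k c_k| is preserved.
        print(k, n, repr(a), b, c, n + 2 == len(a + b + c))
```

With $t = B^k$ the words $b_k$ are prefixes of the Fibonacci word, and the orders $n_k$ are 0, 1, 3, 6, 11, and so on.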

We can now turn to the study of $\lambda(\mathbf{u})$. Knowing a few consecutive terms of $t$,
we are able, using the corresponding graphs, to evaluate certain values of $\ell(n)$ as
a function of $|a_k|$, $|b_k|$, and $|c_k|$. In some cases, we can prove that it is more than
$\varphi n$. If these terms occur infinitely many times in $t$, we deduce that $\lambda(\mathbf{u}) \ge \varphi$.

Proposition 8. If $t$ contains infinitely many occurrences of the words $BC^mA$
(with $m \ge 0$), $ACA$, $ACC$, $CBCB$, $CBCC$, $BBCCB$, $BBCCC$ (i.e. if either
$t$ contains infinitely many occurrences of one word in the list, or if $t$ contains
$BC^mA$ for infinitely many values of $m$), then $\lambda(\mathbf{u}) \ge \varphi$.

Proof. We shall study in detail only the case of the word $BCA$; the other words
are dealt with similarly. Suppose that $t_{k+1} = B$, $t_{k+2} = C$, and $t_{k+3} = A$; let
$n = n_k$, $a = a_k$, $b = b_k$, and $c = c_k$. Then we have:

    i    n_{k+i}            a_{k+i}    b_{k+i}    c_{k+i}
    0    n                  a          b          c
    1    n + |ab|           a          bc         ab
    2    n + 2|ab|          aba        bc         ab
    3    n + 4|ab| + |c|    aba        bc         ababcab

Note that all paths of $G_{n_{k+3}}$ starting at the pointed vertex begin with
$bcababcabab$. Thus $bcababcabab$ is a prefix of $\mathbf{u}$. This gives the beginning of
the path followed by $\mathbf{u}$ in the graphs $G_{n'}$ for $n' \ge n$. In particular, in $G_{n+1}$
(see (3)), the shortest prefix going through every edge has length $|bcab|$, hence
$\ell(n + 2) = |bcab|$; in $G_{n+|b|+1}$ (see (8)), this shortest prefix has length $|bcaba|$,
hence $\ell(n + |b| + 2) = |bcaba|$; and in $G_{n+|ab|+1}$ (see (4), with $k$ replaced by
$k + 1$), $\ell(n + |ab| + 2) \ge |bcababcab|$. Now let $d_1 = \ell(n + |b| + 2) - \varphi(n + |b| + 2)$ and
$d_2 = \ell(n + |ab| + 2) - \varphi(n + |ab| + 2)$, and let us compute $d_1 + \varphi d_2$, recalling that
$n + 2 = |abc|$, and $\varphi^2 = \varphi + 1$:

$$\begin{aligned}
d_1 + \varphi d_2 &= \bigl(\ell(n + |b| + 2) - \varphi(n + |b| + 2)\bigr) + \varphi\bigl(\ell(n + |ab| + 2) - \varphi(n + |ab| + 2)\bigr) \\
&\ge \bigl(|bcaba| - \varphi|bcab|\bigr) + \varphi\bigl(|bcababcab| - \varphi|bcaba|\bigr) \\
&= (1 + \varphi - \varphi^2)\,|bcaba| = 0 .
\end{aligned}$$

This shows that at least one of $d_1$ and $d_2$ has to be non-negative, i.e. that
$\ell(n') \ge \varphi n'$ for $n' = n + |b| + 2$ or $n' = n + |ab| + 2$.

Similarly, for each occurrence in $t$ of a word in the list, there is a length
$n'$ for which $\ell(n') \ge \varphi n'$. If there are infinitely many such occurrences, then
$\lambda(\mathbf{u}) \ge \varphi$. □

Most sequences $t$ satisfy the conditions of Proposition 8; the only words
that do not satisfy them are the elements of the set $\mathcal{A}^*(CBB^+)^\omega \cup \mathcal{A}^* B^\omega$. If
$t \in \mathcal{A}^* B^\omega$, then $\mathbf{u}$ is a morphic image of the Fibonacci sequence and it is easy to
see that $\lambda(\mathbf{u}) = \varphi$; for the other set however, the method of Proposition 8 does
not seem to work.

It is then natural to study the simplest examples of these sequences, for which
$t$ is periodic with a short period. Namely, take $t = (CB^m)^\omega$, with $m \ge 2$. The
recurrences for $a_k$, $b_k$, and $c_k$ can be solved; taking the limit of $(b_k)$, one finds in
particular that the associated Sturmian sequence, $\mathbf{z}_m$, is the fixed point of the
substitution $f^m \circ g$, where $f(0) = 01$, $f(1) = 0$ ($f$ is the substitution defining
the Fibonacci word) and $g(0) = 01$, $g(1) = 1$. Computing $\ell$ for these sequences,
although rather technical, is not very difficult, as the lengths of the paths in the
Rauzy graphs can be computed from the lengths of $a_k$, $b_k$, and $c_k$. If only $\lambda(\mathbf{z}_m)$
is of interest, this amounts to computing the eigenvectors of the matrix of the
substitution $f^m \circ g$, combining them in several ways, and taking the maximum.
Proposition 9 summarizes the results for the first values of $m$.


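Proposition 9. The sequences $\mathbf{z}_m$, $2 \le m \le 5$, yield the following limits:

$$\begin{aligned}
\lambda(\mathbf{z}_2) &= \frac{3 + \sqrt{3}}{3} \approx 1.57735 , \\
\lambda(\mathbf{z}_3) &= \frac{20 - 2\sqrt{10}}{9} \approx 1.51949 , \\
\lambda(\mathbf{z}_4) &= \frac{18 + 2\sqrt{6}}{15} \approx 1.52660 , \\
\lambda(\mathbf{z}_5) &= \frac{415 + 3\sqrt{65}}{280} \approx 1.56852 .
\end{aligned}$$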
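
The substitutions behind these values can be written down explicitly. The sketch below rests on two assumptions that should be stated plainly: the substitution defining $\mathbf{z}_m$ is read as $f^m \circ g$ (for $m = 3$ this reproduces the substitution $0 \mapsto 01001010$, $1 \mapsto 010$ of the Introduction), and the radicands $\sqrt{6}$ and $\sqrt{65}$ in the closed forms for $m = 4$ and $m = 5$ are inferred from the decimal values and from the substitution matrices. The code only evaluates the closed forms and the dominant eigenvalue of each matrix; it does not compute $\lambda(\mathbf{z}_m)$ from the sequences themselves.

```python
import math

F = {"0": "01", "1": "0"}  # Fibonacci substitution f
G = {"0": "01", "1": "1"}  # substitution g


def apply(rules, word):
    return "".join(rules[letter] for letter in word)


def substitution_zm(m):
    """Images of 0 and 1 under f^m o g (apply g first, then f m times)."""
    images = dict(G)
    for _ in range(m):
        images = {letter: apply(F, w) for letter, w in images.items()}
    return images


def dominant_eigenvalue(images):
    """Largest eigenvalue of the 2 x 2 incidence matrix of the substitution."""
    a, b = images["0"].count("0"), images["1"].count("0")
    c, d = images["0"].count("1"), images["1"].count("1")
    trace, det = a + d, a * d - b * c
    return (trace + math.sqrt(trace * trace - 4 * det)) / 2


# Closed forms of Proposition 9 (radicands for m = 4, 5 are assumptions).
CLOSED_FORMS = {
    2: (3 + math.sqrt(3)) / 3,
    3: (20 - 2 * math.sqrt(10)) / 9,
    4: (18 + 2 * math.sqrt(6)) / 15,
    5: (415 + 3 * math.sqrt(65)) / 280,
}

if __name__ == "__main__":
    for m in range(2, 6):
        images = substitution_zm(m)
        print(m, images, dominant_eigenvalue(images), CLOSED_FORMS[m])
```

The same square roots appear in the eigenvalues and in the closed forms, which is consistent with the eigenvector computation described before Proposition 9.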


Among these four examples, the sequence $\mathbf{z}_3$ appears to give the lowest value;
it is indeed possible to compute explicit formulas for all $\lambda(\mathbf{z}_m)$ and to prove that
they are increasing for $m \ge 3$. This observation suggests that $\lambda(\mathbf{z}_3)$ could be the
lowest possible value for Sturmian sequences.

To prove this, we proceed as in Proposition 8, weakening the desired inequality
by replacing $\varphi$ with 1.52.


Proposition  10.  If  t  contains  infinitely  many  occurrences  of  one  of  the  words 
B^,  B^CB"^,  B^CB^CB"^  or  CB^CB'^CB,  then  A(u)  >  1.52  >  A(z3). 


The only sequences $t$ satisfying neither Proposition 8 nor Proposition 10 are
elements of the set $\mathcal{A}^*(CB^2)^\omega \cup \mathcal{A}^*(CB^3)^\omega$, i.e. the corresponding Sturmian
sequences are morphic images of $\mathbf{z}_2$ or $\mathbf{z}_3$, among which $\mathbf{z}_3$ is optimal according
to Proposition 9 (changing a finite prefix of $t$ does not change the value of $\lambda(\mathbf{u})$).
We have thus finished the proof of Theorem 1 in the Sturmian case.

It should be noted that when $t \notin \mathcal{A}^*(CB^3)^\omega$, then $\lambda(\mathbf{u}) \ge 1.52 > \lambda(\mathbf{z}_3)$: the
spectrum of possible values for $\lambda(\mathbf{u})$ for Sturmian sequences is not continuous.
We have not tried to find what the next attainable value is, and 1.52 is just a
rough lower bound.

4  The  General  Case 

Let  us  now  turn  to  the  general  case.  As  noted  in  [1],  sequences  with  large  enough 
complexity  can  be  easily  eliminated. 

Proposition 11. If there is an integer $n_0$ such that the sequence $\mathbf{u}$ satisfies
$p(n + 1) - p(n) \ge 2$ for all $n \ge n_0$, then $\lambda(\mathbf{u}) \ge 2$.

Proof. If this is the case, then $p(n) \ge p(n_0) + 2(n - n_0)$ for $n \ge n_0$, hence there
is a constant $C$ such that $p(n) \ge 2n - C$ for all $n$. According to Proposition 2,
$\ell(n) \ge p(n)$, hence

$$\lambda(\mathbf{u}) = \limsup_{n \to \infty} \frac{\ell(n)}{n} \ge \limsup_{n \to \infty} \frac{p(n)}{n} \ge 2 .$$

□

We can therefore suppose that $p(n + 1) - p(n) = 1$ for infinitely many $n$,
which implies that for infinitely many $n$, the Rauzy graphs are of the types
of Proposition 5, at least if the sequence is recurrent (non-recurrent sequences
have slightly different graphs, the initial vertex being connected by an additional
branch to the main part of the graph, but they can be handled similarly). As for
Sturmian sequences, we can define the sequences $n_k$, $a_k$, $b_k$, and $c_k$, and study
the possible transitions. There are infinitely many possible transitions (including
A, B, and C), as the intermediate graphs can be very complicated. However, in
most cases we will obtain a sufficiently large lower bound for $\lambda(\mathbf{u})$.

The graph $G_{n_k+1}$ may be graph (3) or graph (4), in which case we find
the same transitions A, B, and C as with Sturmian words, but it may also be
graph (2), the complete line graph $D(G_{n_k})$. In this graph, the shortest paths
starting from the pointed vertex and going through every edge are $bccab$ and
$babcc$ (for simplicity, we now write $n = n_k$, $a = a_k$, etc.), hence $\ell(n + 2) \ge |a| + 2|bc|$.
What happens next depends on the respective sizes of $b$ and $c$. If $b$ is shorter,
we get the following graph of order $n + |b|$

where $c = c'b'$, $|b'| = |b|$. There are then two possibilities for $n + |b| + 1$.


In both cases, a path starting from the pointed vertex and going through every
edge has length at least $2|abc|$, i.e. $\ell(n + |b| + 2) \ge 2|abc|$. As in the proof of
Proposition 8, let $d_1 = \ell(n + 2) - \varphi(n + 2)$ and $d_2 = \ell(n + |b| + 2) - \varphi(n + |b| + 2)$,
and let us compute $d_1 + \varphi d_2$, using $|abc| = p(n + 1) \ge n + 2$:

$$\begin{aligned}
d_1 + \varphi d_2 &= \bigl(\ell(n + 2) - \varphi(n + 2)\bigr) + \varphi\bigl(\ell(n + |b| + 2) - \varphi(n + |b| + 2)\bigr) \\
&\ge \bigl(|a| + 2|bc| - \varphi|abc|\bigr) + \varphi\bigl(2|abc| - \varphi|bcab|\bigr)
\end{aligned}$$

$$= (1 + \varphi - \varphi^2)\,|abc| + |bc| - \varphi|b| = |bc| - \varphi|b| .$$

As $|b| < |c|$, this number is positive, hence so is one of $d_1$ and $d_2$, i.e. $\ell(n') \ge \varphi n'$
for $n' = n + 2$ or $n' = n + |b| + 2$. If this transition occurs infinitely often, then
$\lambda(\mathbf{u}) \ge \varphi$; we can therefore assume that this transition does not occur when $k$ is
large enough.

If $c$ is shorter than $b$ or has the same length, several subcases are possible,
most of which can be eliminated with the same kind of arguments. The only
transitions that remain are those where the loop labelled with $c$ is taken a fixed
number of times $j \ge 2$ by every path, with $(j - 1)|c| < |b|$. We eventually get at
order $n + |b|$ the graph


which is essentially the same as graph (5) except that $c$ is repeated $j$ times. This
gives rise to transitions $A_j$ and $B_j$ analogous to $A = A_1$ and $B = B_1$.

We can now define the adic representation of a sequence $\mathbf{u}$ that satisfies
$\lambda(\mathbf{u}) < \varphi$: it is a sequence $t = t_{k_0} t_{k_0+1} t_{k_0+2} \ldots$ on the infinite alphabet
$\mathcal{A}' = \{A_j, B_j \mid j \ge 1\} \cup \{C\}$, where $t_k$ indicates the transition between $G_{n_{k-1}}$ and $G_{n_k}$.
As replacing $A$ and $B$ by $A_j$ and $B_j$ may only increase the values of $\ell$, the rest
of the proof for Sturmian words works with the general case as well, and we can
conclude that Theorem 1 is true for any recurrent binary sequence.

As noted above, the case of non-recurrent sequences uses graphs with a slightly
different morphology, but does not cause any additional problem. The case of an
arbitrary finite alphabet can be easily reduced to the binary case with a simple
projection argument, and this completes the proof of Theorem 1.

Abstract. Designing escrow encryption schemes is an area of much recent
interest. However, the basic design issues, characterizations and difficulties
of escrow systems are not fully understood or specified yet. This
paper demonstrates that in public-key based escrow, the combination of
(1) two different receivers (intended receiver and potentially law enforcement);
and (2) on-line verified compliance assurance by the sender which
ensures that law enforcement can decrypt ciphertext upon court order, is
equivalent to a “chosen ciphertext secure public-key system” (i.e., one secure
against an adversary who uses the decryption oracle before trying to
decipher a target ciphertext). If we further add measures to ensure that
law enforcement is given access to messages only within an authorized
context and law enforcement is assured to comply as well (i.e., it cannot
frame users), then the escrow system is equivalent to “non-malleable
encryption schemes”. The characterizations provide a theoretical underpinning
for escrow encryption and also lead us to new designs.


1  Introduction 

The intent of escrow encryption schemes is to enable strong cryptography for
users while protecting society from criminal behavior. Namely, users can send
encrypted messages while enabling law enforcement (when and only when allowed
by the court) to read their clear messages. The first scheme was the Escrow
Encryption Standard (EES) and its Clipper implementation [19], after which many
systems have been suggested world-wide [10, 19, 25]. Governments, industry and
international organizations are all investigating escrow encryption solutions.

Many  of  the  early  and  recent  designs  focused  on  various  specific  aspects  of 
escrow  encryption,  but  no  rigorous  investigations  of  the  technical  issues  have 
been  done.  One  of  the  basic  issues  that  the  initial  Clipper  implementation  [10] 
and  the  EES  gave  rise  to,  is  the  notion  of  “compliance  assurance  and  verification” 
implemented  through  the  use  of  a  LEAF  authentication  field  in  Clipper.  This  was 
only  based  on  an  intuitive  understanding  drawn  from  an  obvious  need,  and,  in 
fact,  due  to  design  errors  and  lack  of  understanding  of  requirements,  some  severe 
flaws  were  found  [6,  20].  Here  we  attempt  a  step  in  the  direction  of  theoretical 
understanding  of  escrow  systems. 

*  Research  performed  while  at  Sandia  National  Laboratories.  This  work  was  performed 
under  U.S.  Department  of  Energy  Contract  number  DE-AC04-76AL85000. 
Our Results:

An escrow encryption system can be viewed as a system providing private messages
for a regular receiver and a potential additional “shadow receiver” called
law-enforcement. We concentrate on “public key based” schemes, i.e. where the
sender and receiver do not have to meet. We have collected available requirements
and we formally model “escrow systems” based on these basic requirements. We
concentrate on a very basic property of “compliance and its verification” in our
modeling. This property assures that a message sent is made available for future
authorized law enforcement access; it is discussed by several documents and proposed
systems. As an example, the Law Enforcement Activation Field (LEAF)
in Clipper has the purpose of enforcing availability of “sufficient information”
that enables the shadow receiver to read the messages when a proper escrow
procedure takes place, connecting the availability to the decoding availability.
Another example motivating us to investigate compliance issues is a statement
in a NIST (the US National Institute of Standards and Technology) document [34]
which says: “To meet these criteria, encryption products will need to implement
key escrow mechanisms that can not be readily altered or bypassed so as to
defeat the purpose of key escrowing”.

To model compliance verification, we add a formal entity called a “gateway”
G that assures that messages sent into the system (from a sender to the receiver
and the potential “law enforcement”) are in compliance; G is less obtrusive than
the recently suggested “Trusted Third Party entity” [25]. We then ask: Given the
escrow encryption system models with the basic property of compliance, what
type of cryptosystems and security notions characterize them? Such a
characterization helps in understanding the requirements and may also help in
future designs. It may potentially allow implementations to exploit available
cryptographic knowledge and prevent flaws in future system designs.

We call schemes which provide the sender’s compliance checking capability
compliance verifiable escrow encryption systems. These systems, which have a
seemingly necessary ingredient required for full-fledged escrow encryption, are
shown to be strongly related to chosen ciphertext secure public-key encryption
systems, which were first introduced by Naor and Yung [33] and further developed
in various works [35, 9, 40, 28, 4, 21]. We concentrate on systems with formal
proofs of security and we prove that under quite a broad definition of the
respective systems (avoiding narrow scenarios and limiting resources,
concentrating on the principal security requirements), the following holds:

A compliance verifiable escrow encryption system exists iff a chosen
ciphertext secure cryptosystem exists.

Furthermore, if we require more from the escrow system and also assume
that the system has to limit untrusted law enforcement as well, then we get
what we call basic escrow encryption systems. These systems require compliance
verification and in addition they ask for the binding of a message to a proper
limitation of context (namely time, sender and receiver identities). Context is
required to be checked (by an authority, i.e., escrow agents) before messages
are opened to law enforcement (as advocated in the recent design in [27] and by
formal documents). For secure basic escrow systems we show that:

A basic escrow encryption system exists iff a non-malleable encryption
system exists.

Non-malleable  systems  in  essence  do  not  allow  an  attacker  to  modify  chosen 
ciphertexts  to  create  a  new  meaningful  ciphertext  and  were  introduced  by  Dolev, 
Dwork  and  Naor  [16]. 

Our  results  demonstrate  inherent  complexities  in  implementing  systems  like 
the  key  recovery  (by  the  US  government)  and  trusted  third  party  (by  some 
European governments) on top of a public key infrastructure. They show that the
difficulties of assuring compliance in such systems are not only a property of
the current ad-hoc designs, but rather are inherent to any system attempting
to build key escrow with compliance in the available infrastructure.

On  Reductions  between  Cryptosystems: 

We note that in “one-way function based cryptography,” a characterization was
completed and its various primitives (one-way functions, digital signatures,
random generators, private cipher systems) have been shown to be equivalent (in
a long research program which is reviewed in [29]). In “public-key encryption
systems” (our subject) the picture is much less clear. A sufficient condition for
a secure public key communication (i.e., secrecy without the parties sharing a
key) is either a “trapdoor function” or a “key exchange protocol”. These imply
the existence of a “one-way function” since they enable an authentication
protocol [23], but there are indications that one-way functions by themselves
cannot easily imply (based on black-box reductions) “public-key cryptography”
(since such a construction separates NP from P) [24]. Necessary conditions beyond
this are not known, and, therefore, equivalence among various public-key notions
is mostly open and intriguing.

Another  issue  is  the  quality  of  the  cryptographic  reduction.  In  [29]  a  reduction 
is  quantified  by  the  amount  it  reduces  the  security  parameter  of  a  problem 
when a problem is reduced to another one (an idea attributed to L. Levin). Our
reductions are of high quality in this sense: they are linear preserving.

Related  Work: 

Various  designs  have  been  suggested  concentrating  on  several  important  aspects 
and  crucial  stages  of  escrow  schemes.  Most  of  these  aspects  are  orthogonal  to  the 
issue  of  compliance  as  investigated  here.  In  [30]  the  issue  of  key  distribution  to 
trustees  was  discussed,  while  a  more  rigorous  approach  to  a  distribution  channel 
with  minimization  of  various  potential  exposures  was  given  in  [26].  In  [14]  tracing 
receivers  was  discussed  (this  is  criticized  in  [15,  20]),  and  in  [31]  the  distribution 
of  pseudorandom  functions  was  discussed.  The  issue  of  limiting  time  and  context 
of escrow was discussed in [5, 27]. Also, a few alternatives to escrow based on
partial key have been put forth [37, 3, 2]. Opening of ciphertexts based on small
(message by message) granularity was put forth in [11]. Characterizing universal
escrow trapdoor using public-key systems was presented in [7]. Issues for systems
design, based on findings of initial failures, were discussed in [20].

2 Compliance-Verification Systems

We first concentrate on a system with minimal requirements for escrow
encryption, assuming honest law enforcement. As motivated by [34], the system
satisfies the following requirements: (1) compliance verification: the messages
sent are assured to be open-able by law enforcement when the sender actually
employs the system for privacy; and (2) “limiting surveillance”: as said in
[34]: “information both sent and received by the user can be decrypted without
release of keys of other users.”

We  note  that  we  allow  the  systems  in  our  definitions  to  use  all  available 
resources  such  as  interaction  and  perhaps  inefficient  constructions  as  long  as 
they  are  polynomial.  We  are  interested  in  relations  and  we  do  not  necessarily 
limit  ourselves  to  restricted  models. 

Remark:  We  note  that  the  sender  may  use  another  mechanism  to  encrypt 
the  message  (pre-encryption)  or  employ  a  covert  channel.  Our  definition  does  not 
attempt to prevent such transmissions; all we argue is that if a certain
message is generated by the system and is sent via the gateway to the receiver,
then law enforcement should get this message when needed.

The  parties:  There  are  four  parties  which  are  polynomial-time  and  each  has 
its  respective  key:  S  is  the  sender  and  G  is  the  gateway,  through  which  messages 
are  passed,  R  is  the  receiver,  and  L  is  law  enforcement.  The  sender  and  gateway 
may  be  active  at  message  sending  and  are  probabilistic  algorithms.  (We  explain 
the  motivation/source  of  the  gateway  below). 

Basic  properties  and  definition 

- Based on the above two requirements, in order to satisfy the compliance
verification, the gateway G is introduced here; it does not allow a ciphertext
that does not pass the verification (of compliance) to be opened (or
received) by the receiver. Such assurance seems to be a minimal requirement
in a mandatory escrow process. Physically, this gateway may reside at
the sender module, receiver module (as the LEAF checker in Clipper), or
anywhere on the communication channel (the network router, the firewall,
etc.). The gateway is a checking function of the sender and is similar to but
less involved than the recently suggested “trusted third party” [25] endorsed
by a number of European governments and financial institutions.

-  To  satisfy  the  minimal  surveillance  requirement,  the  law  enforcement  key 
must  be  different  than  the  receiver’s  key. 

Definition 1. Compliance verifiable escrow encryption system (CV-EES):
Let $k$ be the security parameter for an encryption system which for any
Law enforcement (L) with a randomly chosen public key $e_L$ (and corresponding
private key $d_L$), for any public key $e_R$ (chosen at random) and corresponding
private key $d_R$ of the Receiver (R), and a verification key $v_G$ for a
compliance Gateway (G), for any Sender (S), the following holds.
Let $\alpha$ be the encryption of a message $m$ generated by S, namely
$S(e_R, e_L, m) = \alpha$; then there is a protocol between S and G, and then
R acts on $\alpha$ (to get the message), and L may apply to $\alpha$ later. We are
assured that:

Certification of Compliance: for any $\alpha$, let $G(v_G, \alpha)$ be the result of a
protocol between G and S computed by G; then $G(v_G, \alpha) = 1$ implies that there
exists an $m$ such that $R(d_R, \alpha) = m$ and $L(d_L, \alpha) = m$ with probability
$1 - 1/k^d$ for any constant $d$, for parameter $k$ large enough.

Security: The system is polynomially secure [22], namely for any two messages
$m_0, m_1$ computed by a message finder, for any ciphertext $\alpha$ that encrypts one
of the messages $m_b$ ($b \in \{0,1\}$ chosen at random), for any message
distinguisher that is given $(e_R, e_L, v_G, m_0, m_1, \alpha)$ and returns
$b' \in \{0,1\}$, we have $b = b'$ with probability less than
$\frac{1}{2} + \frac{1}{k^d}$ for any constant $d$, for $k$ large enough.

Note that we can have a number of variations that do not change the system
in a fundamental way: we can assume that the ciphertext is generated
interactively with the gateway; also, the order of choice and publication of the
keys does not matter, as a receiver cannot help itself by using $e_L$ in its key
generation, so $e_R$ is actually drawn at random to be secure.


2.1  Chosen  ciphertext  security 

Let  us  recall  the  definition  of  chosen-ciphertext  secure  systems  [33]. 


Definition 2. Chosen ciphertext secure encryption system (CCS system):
Let $k$ be the security parameter for a public key encryption system
which generates a public/private key pair $(e, d)$ for each user of the system. The
adversary attacking a user (a CC-attacker) is a sender who is allowed the
following attack: It generates a history tape $h$ from $1^k$, $e$ and input/output
pairs from (poly in $k$) ciphertext queries it provides adaptively to a decryption
oracle which has $d$. Then, the following holds:

Security: Two messages $m_0, m_1$ from the message space are generated by a
probabilistic polynomial-time algorithm called a message finder on input $1^k$
and an auxiliary input tape which may include $h$ and $e$ and other public
information. Let $\alpha$ be the encryption of $m_b$ with $e$ for some randomly
chosen bit $b$. Lastly, a message distinguisher given $(e, m_0, m_1, h, \alpha)$
returns $b' \in \{0,1\}$. A system is secure against chosen ciphertext attack if it
is polynomially secure after the attack, namely, for any CC-attacker, for any
message finder, for any message distinguisher, $b = b'$ with probability less
than $\frac{1}{2} + \frac{1}{k^d}$ for any constant $d$ and $k$ large enough.

Remark: the definition above assumes a non-adaptive attacker in the sense
that the target ciphertext is not available to it when producing h; we may also
allow an adaptive attacker (that gets to see the challenge first, but is not
allowed to query the oracle on it).
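
To make the attack game of Definition 2 concrete, the following Python sketch
runs its three phases (oracle queries, message finding, distinguishing) against
a deliberately insecure scheme. Everything in it is an assumption chosen for
illustration and does not come from the paper: the scheme is textbook RSA with
the toy modulus 3233, and the names ReencryptAttacker and cca_experiment are
hypothetical. Because the toy encryption is deterministic, the distinguisher
can simply re-encrypt one candidate message, so such a scheme is not even
polynomially secure.

```python
import secrets

# Toy, deterministic "textbook RSA" (n = 61 * 53); insecure by design.
N, E, D = 3233, 17, 2753

def keygen():
    return E, D                      # (public key e, private key d)

def enc(e, m):
    return pow(m, e, N)

def dec(d, c):
    return pow(c, d, N)

class ReencryptAttacker:
    """Hypothetical CC-attacker exploiting deterministic encryption."""

    def oracle_phase(self, e, oracle):
        # Non-adaptive phase: query the decryption oracle before seeing the
        # challenge and record input/output pairs on the history tape h.
        return [(c, oracle(c)) for c in (2, 3, 5)]

    def find_messages(self, e, h):
        return 10, 20                # message finder output (m0, m1)

    def distinguish(self, e, m0, m1, h, alpha):
        # Re-encrypt m0 and compare it with the challenge ciphertext.
        return 0 if enc(e, m0) == alpha else 1

def cca_experiment(attacker):
    """Run one game; return True if the attacker guesses the hidden bit b."""
    e, d = keygen()
    h = attacker.oracle_phase(e, lambda c: dec(d, c))
    m0, m1 = attacker.find_messages(e, h)
    b = secrets.randbelow(2)
    alpha = enc(e, (m0, m1)[b])
    return attacker.distinguish(e, m0, m1, h, alpha) == b

wins = sum(cca_experiment(ReencryptAttacker()) for _ in range(100))
print(wins, "wins out of 100 games")
```

A secure system would keep the winning frequency close to one half; here the
attacker wins every game, far above the bound required by the definition.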

2.2 Equivalence of the systems

Next we compare the systems above: the first, motivated by requirements of an
escrow encryption environment, and the second, which assures a level of security.
We prove the following:

Theorem 3. The following are equivalent: (1) Existence of a compliance verifiable
escrow system, and (2) Existence of a chosen ciphertext secure encryption system.

To  prove  the  theorem  we  will  show  reductions  in  the  next  two  Lemmas: 

Lemma 4. If there exists a compliance verifiable escrow system (CV-EES) then
there exists a chosen ciphertext secure (CCS) encryption system.

Proof. (Sketch) We assume that there exists a CV-EES and build a CCS
system. Let G be the gateway and $v_G$ be the verification key for the CV-EES.
Let $e_R, e_L$ be the public keys and $d_L$ be the law enforcement key corresponding
to $e_L$ for the CV-EES.

In  the  following  we  will  use  a  “tinkering  argument”  that  will  move  keys  and 
components  around  to  have  a  public-key  system  based  solely  on  the  components 
available  to  us  from  the  CV-EES  system.  We  demonstrate  that  the  following  is  a 
CCS  system  in  a  complete  public-key  environment  where  every  participant  has 
a key (as in Rackoff and Simon [35]). The following is done:

- Each user $u$ publishes a public key as a receiver, which is drawn from the
family of receivers’ public keys;

- in addition it publishes a sender key, which is from the family of law
enforcement keys $e_L$.

Let V be the sender and U the receiver; we define the system as follows:
Encryption of $m$: $\alpha = S(e_U, e_V, m)$.

Decryption of $\alpha$: If $G(v_G, \alpha) = 1$ then return $m = L(d_U, \alpha)$, else return NULL.

First note that the system is polynomially secure; this is derived from the
security definition of the CV-EES system. To prove that this is secure against
a chosen ciphertext attack we use the following argument. When G returns 1
it means that the two decryptions (under $d_U$ and under $d_V$) retrieve the same
message with overwhelming probability. Thus, in a similar argument to [35],
if $G(v_G, \alpha) = 1$ then the sender must have known the input which generated
ciphertext $\alpha$ (by knowing $d_V$ and applying it to retrieve the message). Observe
that this key is the sender’s private key drawn from the family of law enforcement
keys, which corresponds to $e_V$. Since “the sender” already knew the value
that is encrypted, we are sure that by revealing it after the check “the sender”
won’t learn anything new. Hence, the attacker, being the sender, is reduced to a
“known plaintext attack”, which is taken care of by the property of polynomial
security. In fact we can show that by providing the ciphertext queries and
getting a corresponding cleartext answer in a CC-attack producing history $h$,
the attacker has no more power than the (message only) attacker that produces
by itself (without the help of the oracle) a cleartext message and then produces
its ciphertext, and so produces a history $h'$ of ordered ciphertexts and their
corresponding messages.

The above reduction is direct (using the same keys used in the original
system) and thus one does not lose in the size of the security parameter when
translating the CV-EES to the CCS cryptosystem. This implies that a success
ratio in breaking the CCS cryptosystem implies the same ratio for the escrow
related scheme (CV-EES), which means a linear preserving reduction.

Lemma 5. If there exists a chosen ciphertext secure (CCS) encryption system
then there exists a compliance verifiable escrow system.

Proof. (Sketch) First, we have an encryption scheme which is polynomially
secure (by definition of CCS-cryptosystem); in fact all we need is a secure
encryption for proving the lemma. So L can publish a public key $e_L$ in such a
system, and R publishes another public key $e_R$ in this system. Then note that
since we have a chosen ciphertext secure encryption system we have a one-way
function (we can use the encryption function for an authentication protocol among
two parties, thus by [23] one-way functions exist).

We continue by a simple version of a construction of [33] to construct the
CV-EES. To send a message $m$, a sender first generates two encryptions of $m$: one
for the receiver under $e_R$ and one for law enforcement under $e_L$. The verification
algorithm of G is done by a zero-knowledge proof of knowledge of the fact that
“The sender (i.e., prover) knows a unique message $m$ such that the two ciphertexts
are encryptions of it under their respective keys”. This is an NP statement and
can be proven to G interactively in a zero-knowledge fashion by the sender (that
knows the preimages) using the availability of one-way functions. This proof
assures that the message opened by the receiver using his private key is the
same message available to law enforcement if they wish to open it using their
key; thus G can allow the two ciphertexts to be transmitted together over the
communication line. (In the next version we will formally recall the definition of
proof of knowledge [17, 38, 1] and use it to show that the system has the required
properties and that with very high probability both security and certification of
compliance hold).

Using amplification of one-way functions [39] we can make the probability
of extracting any computational advantage in the CV-EES system based on
the zero-knowledge proof inverse exponential in the security parameter for any
polynomial-time computation. Thus, breaking the system based on breaking the
ZK proofs adds a negligible value to the time-success ratio. Now observe that the
reduction just uses two CCS encryption systems, and the above zero-knowledge
proof (which has the inverse exponential success probability). Breaking the
CV-EES encryption means that most of the time (in at least 1/2 of the cases) we
break one of them. Therefore, this reduction is linear preserving.
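
The shape of the construction in the proof of Lemma 5 (two encryptions of the
same message plus a proof to the gateway that they agree) can be sketched in
Python as below. This is only an illustration under stated assumptions, not the
construction of the paper: the general zero-knowledge proof for an NP statement
is swapped for a Chaum-Pedersen style three-move proof of plaintext equality,
which is zero-knowledge only against an honest verifier; the encryption is
plain ElGamal over a toy group, which is not chosen ciphertext secure; and all
function names and parameters are hypothetical.

```python
import secrets

# Toy group (assumption): P = 2Q + 1 with Q prime; G has order Q modulo P.
P, Q, G = 2039, 1019, 4

def keygen():
    d = secrets.randbelow(Q - 1) + 1          # private key
    return pow(G, d, P), d                    # (public key e, private key d)

def elgamal_enc(e, m, r):
    return pow(G, r, P), (m * pow(e, r, P)) % P

def elgamal_dec(d, c):
    c1, c2 = c
    return (c2 * pow(c1, Q - d, P)) % P       # c1^(Q-d) equals c1^(-d)

def sender_encrypt(e_R, e_L, m_for_R, m_for_L):
    """An honest sender passes the same message twice."""
    r1, r2 = secrets.randbelow(Q), secrets.randbelow(Q)
    alpha = (elgamal_enc(e_R, m_for_R, r1), elgamal_enc(e_L, m_for_L, r2))
    return alpha, (r1, r2)                    # ciphertext pair and witness

def prover_commit(e_R, e_L):
    s1, s2 = secrets.randbelow(Q), secrets.randbelow(Q)
    t = (pow(G, s1, P), pow(G, s2, P),
         pow(e_R, s1, P) * pow(e_L, Q - s2, P) % P)
    return t, (s1, s2)

def prover_respond(witness, secret, c):
    (r1, r2), (s1, s2) = witness, secret
    return (s1 + c * r1) % Q, (s2 + c * r2) % Q

def gateway_verify(e_R, e_L, alpha, t, c, z):
    (a1, b1), (a2, b2) = alpha
    (t1, t2, t3), (z1, z2) = t, z
    ratio = b1 * pow(b2, P - 2, P) % P        # b1 / b2; equal plaintexts cancel
    ok1 = pow(G, z1, P) == t1 * pow(a1, c, P) % P
    ok2 = pow(G, z2, P) == t2 * pow(a2, c, P) % P
    ok3 = (pow(e_R, z1, P) * pow(e_L, Q - z2, P) % P
           == t3 * pow(ratio, c, P) % P)
    return ok1 and ok2 and ok3

def run_gateway_protocol(e_R, e_L, alpha, witness):
    """Simulate both sides of the proof; only the sender's side uses witness."""
    t, secret = prover_commit(e_R, e_L)
    c = secrets.randbelow(Q - 1) + 1          # gateway's nonzero challenge
    z = prover_respond(witness, secret, c)
    return int(gateway_verify(e_R, e_L, alpha, t, c, z))

e_R, d_R = keygen()
e_L, d_L = keygen()
m = pow(G, 5, P)                              # messages are subgroup elements
alpha, w = sender_encrypt(e_R, e_L, m, m)
print(run_gateway_protocol(e_R, e_L, alpha, w))
print(elgamal_dec(d_R, alpha[0]) == elgamal_dec(d_L, alpha[1]) == m)
```

When the gateway outputs 1, the receiver and law enforcement recover the same
message from their respective halves of the pair, which is the certification
of compliance property; a sender that encrypts two different subgroup elements
and follows the protocol is rejected in this sketch.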

3 Basic escrow encryption systems and non-malleability

CV-EES  systems  are  not  sufficient  to  protect  individuals’  privacy  rights  from 
unlawful  search  and  seizure  as  they  impose  no  compliance  restriction  on  opening 
of  ciphertext  by  law  enforcement.  For  example,  it  was  shown  that  with  Clipper 
it  is  possible  to  modify  the  ciphertext  so  that  it  appears  the  ciphertext  was 
generated  by  (or  for)  a  Clipper  chip  different  from  the  actual  participants  [20]. 
Let us review what we want from a “basic escrow encryption”.

-  First,  we  want  “basic  escrow  encryption  systems”  to  assure  compliance  of 
senders  (as  in  CV-EES),  and  to  be  secure  (as  in  CV-EES). 

- In addition, it has been concluded by many that we need to have some
context associated with a ciphertext which determines if law enforcement has
the right to open that message. Then, an authorization body (judge, escrow
agents, etc.) can use this context to determine whether to allow law
enforcement to open or not to open a ciphertext (this is context-limited escrow).
This is motivated by various designs [6, 20, 30, 8, 27, 5] and primarily by
the correspondences on Clipper [18, 10, 19]. The ciphertext context includes
the sender and receiver identity since, formally, “Law enforcement agencies
require (1) information from the service provider to verify the association of
the intercepted communications with the intercept subject, ...” [18]. This
seems a reasonable minimal requirement. Note that “context limitation” is
a double-edged sword. Namely, the sender who knows that law enforcement is
allowed to escrow based on restricted context can attach a “wrong context”
to evade legal escrowing. Thus, the sender’s compliance has to be revised
to include also compliance with “a correct context”, which is assured by
extended compliance assurance which includes context certification.

-  Next,  from  a  security  point  of  view,  we  would  need  to  be  able  to  identify 
a  sender  with  the  message  and  not  enable  law  enforcement  to  modify  the 
sender’s  ID  nor  the  other  content  and  context  (opening  of  messages  is  allowed 
only  within  a  context).  This  makes  the  system  spoofing-free  (with  respect 
to  law  enforcement  that  tries  to  modify  messages  or  fabricate  ones  based 
on  past  opened  messages  and  even  when  it  can  control  some  of  the  earlier 
messages  sent  in  a  conversation). 

The notion of spoofing-freeness looked to us related to the one of
non-malleability (defined in [16]). The latter helped the formalization of the
above requirements as follows:

Definition 6. Basic Escrow encryption system (B-EES): Let $k$ be the
security parameter for an encryption system which for any Law enforcement (L)
with a randomly chosen public key $e_L$ (and corresponding private key $d_L$), for
any verification key $v_G$ for a compliance Gateway (G), and a legal authority J
with authorization key $a_J = a_J(e_L, d_L)$, for any randomly chosen public key
$e_R$ and private key $d_R$ of the Receiver (R) (each of the keys drawn from a
corresponding key family with parameter $k$), then, for any Sender (S):
Let $\kappa\tau$ be the context of a message, which at minimum includes the identity
of the sender and receiver. Let $\alpha$ be the encryption of a message $m$
generated by S using the receiver’s key $e_R$, namely
$S(e_R, e_L, \kappa\tau, m) = \alpha$; then:

(1) Compliance and Context Certification and Correctness: for any $\alpha$,
let $G(v_G, \alpha)$ be the result of a protocol between G and S computed by G; then
$G(v_G, \alpha) = 1$ implies that with probability $1 - 1/k^d$ for any $d$, for $k$
large enough:

- there exists an $m$ such that $R(d_R, \alpha) = (m, \kappa\tau)$, and

- the result of $J(a_J, \alpha) = (\mathit{context}, \mathit{key})$ has
$\mathit{context} = \kappa\tau$, and $L(d_L, \alpha, \mathit{key}) = (m, \kappa\tau)$.
(We may assume that the context is part of the message).

(2) Context-Limited Escrow: For any ciphertext $\alpha$ and an authorized
context $\kappa\tau$, let $J(a_J, \alpha) = (\mathit{context}, \mathit{key})$. If
$\mathit{context} = \kappa\tau$ then L is activated and
$L(d_L, \alpha, \mathit{key}) = (m, \kappa\tau)$.

(3) Spoofing-freeness: We define a poly-time adversary $A$ which may try
to produce a message by modifying another message according to some poly-time
relation $REL$ ($REL$ different from the identity relation), thus spoofing the
system (and generating a message out of context or with different content that
may be opened by the judge and in effect will frame the user); formally:

- The adversary $A$ first generates a history tape $h_1$ from $1^k, e, e_L, d_L, e_R$
and input/output pairs from (poly in $k$) queries it provides to the authorizing
authority with authorized contexts: for any ciphertext $\alpha_i$ and an authorized
context $\kappa\tau_i$, let $J(a_J, \alpha_i) = (\mathit{context}_i, \mathit{key}_i)$. If
$\mathit{context}_i = \kappa\tau_i$ then L is activated and
$L(d_L, \alpha_i, \mathit{key}_i) = (m_i, \kappa\tau_i)$. The record
$(\alpha_i, \kappa\tau_i, \mathit{key}_i, m_i)$ is put on the history tape.

-  Then,  A  produces  a  distribution  M  on  messages  (and  contexts). 

- Then $A$ receives the challenge ciphertext $\alpha \in_R S(e, e_L, \kappa\tau, m)$
for $m \in_R M$, and some knowledge about the message (e.g., its context) called
$hint(m)$, which is polynomial-time computable from $m$.

- $A$ again generates a history tape $h_2$ from $1^k, e, e_L, d_L, e_R$ and
input/output pairs from (poly in $k$) queries, all different from $\alpha$, that it
provides to the authorizing authority with authorized contexts: for any
ciphertext $\alpha_j$ and an authorized context $\kappa\tau_j$, let
$J(a_J, \alpha_j) = (\mathit{context}_j, \mathit{key}_j)$. If
$\mathit{context}_j = \kappa\tau_j$ then L is activated and
$L(d_L, \alpha_j, \mathit{key}_j) = (m_j, \kappa\tau_j)$. The record
$(\alpha_j, \kappa\tau_j, \mathit{key}_j, m_j)$ is put on the history tape.

- $A$ now produces polynomially many ciphertexts $f_i$ such that $f_i$ is an
encryption of $\beta_i$. Then $A$ succeeds if $REL(m, \beta_i)$ holds for some $i$.

The system is called spoofing-free if for any polynomial-time $A$, for any
polynomial-time modification relation $REL$, the probability of success is
smaller than $1/k^d$ for any constant $d$, for $k$ large enough. This concludes
the definition.
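
As a concrete picture of what the spoofing-freeness game rules out, the
following Python sketch shows a standard malleability attack on plain ElGamal
encryption: without knowing the plaintext or any private key, an adversary
turns an encryption of $m$ into an encryption of $2m$. The scheme, the toy
group parameters and the function names are assumptions made for illustration
and are not taken from the paper; the relation REL here is "beta equals twice
m modulo P".

```python
import secrets

# Toy group (assumption): P = 2Q + 1 with Q prime; G has order Q modulo P.
P, Q, G = 2039, 1019, 4

def keygen():
    d = secrets.randbelow(Q - 1) + 1
    return pow(G, d, P), d                    # (public key e, private key d)

def enc(e, m):
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (m * pow(e, r, P)) % P

def dec(d, c):
    c1, c2 = c
    return (c2 * pow(c1, Q - d, P)) % P       # c1^(Q-d) equals c1^(-d)

def maul(c, factor):
    """Map an encryption of m to an encryption of factor * m (mod P)."""
    c1, c2 = c
    return c1, (c2 * factor) % P

def REL(m, beta):
    # The modification relation the adversary aims for (not the identity).
    return beta == (2 * m) % P

e, d = keygen()
m = 1024
alpha = enc(e, m)                             # challenge ciphertext
forged = maul(alpha, 2)                       # adversary never sees m or d
print(forged != alpha, REL(m, dec(d, forged)))
```

A scheme that admits such a modification cannot be spoofing-free, since the
adversary succeeds with probability one for this relation.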

We note that spoofing-freeness is modeled after non-malleability and implies
polynomial security. In fact, we can show more strongly (proof omitted) that
following the proof strategy of Theorem 3 gives:

Theorem 7. The following are equivalent: (1) Existence of basic escrow
encryption systems, and (2) Existence of non-malleable encryption systems.

Designs: 

The  characterizations  have  led  us  to  a  number  of  designs  based  on  secure 
public  key  systems  (and  their  relaxations).  The  designs  introduce  various  ways 
to  implement  the  compliance  verifying  gateway. 

Based on a private key system we can consider a server-based key distribution
system (where users do not meet but each user shares a permanent key with
a server). We can adapt our results and conclude that by augmenting such a
system we can have an escrow system in this model. What we need is the notion
of “publicly certified key distribution” where each key given to a user also has
a publicly announced version which is encrypted or one-way processed by the
trusted server. Now, each key distribution to a pair of users can be verified
on-line by a gateway G for compliance. Unlike [25], this design needs only
one-way functions. We get (proof omitted):

Theorem 8. Based on a trusted server and the existence of a one-way function
(only), there exists a basic escrow system.


Abstract. The model of Non-Interactive Zero-Knowledge makes it possible to
obtain minimal interaction between prover and verifier in a zero-knowledge
proof if a public random string is available to both parties. In this paper
we investigate upper bounds for the length of the random string for
proving one and many statements, obtaining the following results:

- We show how to prove in non-interactive perfect zero-knowledge any
polynomial number of statements using a random string of fixed
length, that is, not depending on the number of statements. Previously,
such a result was known only in the case of computational
zero-knowledge.

- Under the quadratic residuosity assumption, we show how to prove
any NP statement in non-interactive zero-knowledge on a random
string of length $\Theta(nk)$, where $n$ is the size of the statement and $k$ is
the security parameter, which improves the previous best construction
by a factor of $\Theta(k)$.

1  Introduction 

Zero-knowledge proofs [19, 17] require quite a rich scenario in terms of resources
needed and much effort has been devoted to presenting alternative poorer
settings in which zero-knowledge proofs are possible.

In  [5,  6,  12],  the  shared-string  model  for  non-interactive  zero-knowledge  was 
put  forward.  Here,  the  prover  and  the  verifier  share  a  random  string  and  the 
mechanism  of  the  proof  is  mono-directional:  the  prover  sends  one  message  to  the 
verifier. Non-interactive zero-knowledge proofs have found several applications in
Cryptography (most notably the construction of cryptosystems secure against
chosen-ciphertext attacks [24]) and can be employed in any setting in which
communication is a precious and scarce resource. Thus, the shared-string model
trades the need for interaction with the need for shared randomness. Since
non-interactive zero-knowledge proofs from scratch can be obtained only for BPP
languages [18], the shared-string model provides a minimal enough setting for
non-interactive zero-knowledge.

Randomness has played a major role in several theoretical and applied fields
of Computer Science. There are several examples of computational tasks which are
impossible to execute deterministically or whose efficiency is greatly enhanced
if a source of random bits is available. Unfortunately, good random sources are
difficult to find and this has motivated the study of the minimal amount of
randomness needed for certain tasks (e.g., computing the sum in a secure way [7]),
of techniques for reducing the number of random bits used by probabilistic
algorithms (see for instance [20]) and the construction of pseudorandom generators
specific for certain computational tasks: pseudorandom generators for
constant-depth circuits [1, 25], space bounded computation [26, 27] and network
computation [23] have been presented. The randomness in interactive proof systems
has been studied in [2] and [3].

In this paper we consider the shared-string model for non-interactive
zero-knowledge of [5, 6] and study the amount of shared randomness needed for
zero-knowledge proofs.

Perfect  zero-knowledge  on  a  fixed  random  string.  The  first  problem  we 
investigate  is  the  possibility  of  proving  many  statements  using  a  random  string 
of  fixed  length,  i.e.,  not  depending  on  the  number  of  statements.  This  problem 
has  found  early  solutions  for  the  case  of  computational  zero-knowledge  in  [5], 
assuming  the  intractability  of  quadratic  residuosity,  and,  later,  in  [15],  assuming 
the  existence  of  certified  one-way  permutations.  The  certification  requirement 
for  one-way  permutations  was  later  removed  in  [4].  In  [15,  13]  the  case  of  many 
provers  was  solved.  Unfortunately,  these  constructions  do  not  preserve  perfect 
zero  knowledge  and  thus  cannot  be  used  in  our  context.  Before  the  current  paper, 
no  indication  had  been  given  that  this  problem  might  have  a  positive  solution 
in  the  case  of  perfect  zero-knowledge.  The  state  of  this  problem  was  particularly 
unclear  also  because  not  many  non-interactive  perfect  zero-knowledge  protocols 
have  been  found  in  the  literature  (see  [11]). 

Our results. We show how to prove many statements in non-interactive perfect
zero-knowledge using a fixed random string. First we give a protocol for the
language of quadratic non residuosity. Then we identify a general class of languages,
called Simulator-Rankable languages, for which we give a protocol. Finally, we
show that all languages known to have a non-interactive perfect zero-knowledge
proof system are Simulator-Rankable.

Non-interactive zero-knowledge for all NP on a short random string.
Another problem we investigate is the possibility of proving any NP statement
using  a  random  string  of  short  length.  Many  non-interactive  zero-knowledge 
proof  systems  for  NP-complete  languages  have  been  given  in  the  literature,  mo¬ 
tivated  by  attempts  both  of  reducing  the  complexity  assumption  necessary  and 
of  increasing  the  efficiency  of  the  proof  system.  The  first  proof  system  for  all  NP 
was  given  in  [6],  under  a  specific  number-theoretic  assumption,  and  used  a  ran¬ 
dom  string  of  length  G{kn^),  where  by  k  we  denote  the  security  parameter,  and 
by n the size of the input. The proof system in [5, 12] reduced the assumption to
the intractability of deciding quadratic residuosity modulo composite integers,
and  used  a  string  of  length  e(kn^).  The  proof  system  in  [15]  reduced  the  assump¬ 
tion  to  the  intractability  of  inverting  one-way  permutations,  and  used  a  string 
of  length  0{kn^'^).  Under  the  same  assumption,  [21]  and  [22]  obtained  proof 
systems  using  a  string  of  length  &{k^nlogn)  and  &{k‘^n),  respectively.  Under 
the  quadratic  residuosity  assumption,  [9]  and  [8]  obtained  proof  systems  using 
a  random  string  of  length  0(k^n).  As  a  result,  the  best  known  proof  system  for 
all NP before this paper uses a random string of length $\Theta(k^2 n)$.

Our result. Under the quadratic residuosity assumption, we show how to prove
any NP statement in non-interactive zero-knowledge using a random string of
length $\Theta(kn)$, thus improving the previous best result by a factor of $\Theta(k)$.

Lower bounding the length of the random string. In order to best estimate
the efficiency of our proof systems, we have also looked at the question
of  finding  lower  bounds  on  the  length  of  the  random  string  necessary  to  obtain 
a  non-interactive  zero-knowledge  proof.  Previously,  a  result  in  [18]  showed  that 
non-interactive  (computational  or  perfect)  zero-knowledge  proofs  without  the 
random  string  are  possible  only  for  languages  in  BPP. 

Our result. We can show that non-interactive (computational or perfect)
zero-knowledge proofs on a random string of length less than $\max(k, c \log n)$, for any
constant c, can be given only for languages in BPP.

Organization of the paper. In Section 2, we review the definitions for
non-interactive zero-knowledge proofs. In Section 3, we present our results on proving
multiple statements in non-interactive perfect zero-knowledge on a fixed random
string. In Section 4, we present our result on proving any NP statement in
non-interactive zero-knowledge on a short random string. Formal proofs and
descriptions of some protocols are omitted from this extended abstract for lack of
space. For the same reason, we follow the notation of [5] without explicitly
repeating it and advise the reader to refer to [28] or [5] for the necessary
number-theoretic background.

2  Non-interactive  Zero-Knowledge 

We  review  the  definition  of  non-interactive  zero-knowledge  proof  systems  of  [5] , 
referring  the  reader  to  the  original  paper  for  motivations  and  discussions.  We 
start  with  the  definition  of  non-interactive  proof  systems. 

Definition 1. Let P be a probabilistic Turing machine and V a deterministic
Turing machine that runs in time polynomial in the length of its first input. We say
that (P,V) is a Non-interactive Proof System with security parameter $k > 1$ for
the language L if there exists a constant c such that the following hold:

1. Completeness. $\forall x \in L$, $|x| = n$, and for all sufficiently large n,

$\Pr(\sigma \leftarrow \{0,1\}^{n^c};\ Proof \leftarrow P(\sigma, x) : V(\sigma, x, Proof) = 1) > 1 - 2^{-k}$.

2. Soundness. $\forall x \notin L$, $|x| = n$, for all Turing machines $P'$, and for all sufficiently
large n,

$\Pr(\sigma \leftarrow \{0,1\}^{n^c};\ Proof \leftarrow P'(\sigma, x) : V(\sigma, x, Proof) = 1) < 2^{-k}$.

We will call the random string $\sigma$, input to both P and V, the reference string.
Now we recall the definitions of non-interactive computational and perfect
zero-knowledge proof systems. We will denote by $View(n, x)$ the probability space
$View(n, x) = \{\sigma \leftarrow \{0,1\}^{n^c};\ Proof \leftarrow P(\sigma, x) : (\sigma, Proof)\}$, where c is a constant.


Definition 2. Let (P,V) be a non-interactive proof system for the language L.
We say that (P,V) is Computational Zero-Knowledge if there exists an efficient
algorithm S, called the Simulator, such that $\forall x \in L$, $|x| = n$, for all efficient
non-uniform (distinguishing) algorithms $D_n$, $\forall d > 0$, and all sufficiently large n,

$| \Pr(s \leftarrow View(n, x) : D_n(s) = 1) - \Pr(s \leftarrow S(1^n, x) : D_n(s) = 1) | < n^{-d}$.

Definition 3. Let (P,V) be a non-interactive proof system for the language L.
We say that (P,V) is Perfect Zero-Knowledge if there exists an efficient algorithm
S, called the Simulator, such that $\forall x \in L$, $|x| = n$, and all sufficiently large n,
the two probability spaces $S(1^n, x)$ and $View(n, x)$ are equal.


3  Perfect  zero-knowledge  on  a  fixed  random  string 

In this section we show how to prove any polynomial number of statements in
non-interactive perfect zero-knowledge using a reference string of fixed length. In
Subsection 3.1 we present our technique with respect to the language of quadratic
non residuosity. In Subsection 3.2 we give a result that will be useful when
proving this result for a more general class of languages: a transformation from
any non-interactive zero-knowledge proof system with expected polynomial time
simulator to one with strict polynomial time simulator. In Subsection 3.3 we
describe a protocol that applies to a more general class of languages, that we
call Simulator-Rankable languages.

Some simplifications. For simplicity, in our protocol for quadratic non
residuosity we will assume that the modulus x is already known (or has already been
proven) to be a Blum integer and, unless explicitly specified, that the reference
string is made of integers in $Z_x^{+1}$, rather than of just n-bit integers. Techniques
used, for instance, in [5] and [11], allow one to deal with the general cases by
losing only a constant factor in the length of the reference string, and preserving
perfect zero knowledge.


3.1  Quadratic  non  residuosity 

We present a perfect zero-knowledge proof system (A,B) with security parameter
k that uses a reference string of length $\Theta(nk)$ for proving that any polynomial
number $m(n)$ of elements $y_1, \ldots, y_{m(n)}$ are quadratic non residues modulo an
integer x of length n.

The proof system of [5] for one statement. The non-interactive perfect
zero-knowledge proof system of [5] for proving one quadratic non residuosity
statement of size n uses a reference string of length nk. On input a pair $(x, y)$,
where x is a Blum integer and y an element of $Z_x^{+1}$, the reference string is
viewed as the concatenation of k elements $z_1 \circ \cdots \circ z_k$ of $Z_x^{+1}$. If y is a quadratic
non residue, then for each j, exactly one of $z_j$ and $y z_j \bmod x$, call it $u_j$, is a
quadratic residue and the prover gives a random square root of $u_j$. The soundness
of the proof system relies on the fact that if y is a quadratic residue and $z_j$ is
a quadratic non residue then neither $z_j$ nor $y z_j \bmod x$ is a quadratic residue
and thus the prover cannot satisfy the verifier's verifications. Since the $z_j$'s are
chosen at random and since exactly half of the elements in $Z_x^{+1}$ are quadratic
non residues, the prover has probability $2^{-k}$ of making the verifier accept when
y is a quadratic residue.
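
The single-statement proof just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`is_qr`, `random_sqrt`, `prove`, `verify`) are hypothetical, the modulus is a toy Blum integer, the prover is handed the factorization p, q, and the inputs y and z_j are assumed to lie in $Z_x^{+1}$ already (no membership check is performed).

```python
import random

def is_qr_prime(a, p):
    """Euler's criterion for an odd prime p (a not divisible by p)."""
    return pow(a % p, (p - 1) // 2, p) == 1

def is_qr(a, p, q):
    """a is a square modulo x = p*q iff it is a square modulo both primes."""
    return is_qr_prime(a, p) and is_qr_prime(a, q)

def random_sqrt(a, p, q):
    """A random square root of the residue a modulo the Blum integer p*q.

    Uses p, q = 3 (mod 4) for the roots modulo each prime, then the
    Chinese Remainder Theorem, with independent random signs.
    """
    x = p * q
    rp = pow(a % p, (p + 1) // 4, p) * random.choice((1, -1))
    rq = pow(a % q, (q + 1) // 4, q) * random.choice((1, -1))
    return (rp * q * pow(q, -1, p) + rq * p * pow(p, -1, q)) % x

def prove(p, q, y, zs):
    """For each z_j, reveal a root of whichever of z_j, y*z_j is a residue."""
    x = p * q
    return [random_sqrt(z if is_qr(z, p, q) else y * z % x, p, q) for z in zs]

def verify(x, y, zs, roots):
    """Accept iff every s_j squares to z_j or to y*z_j modulo x."""
    return len(roots) == len(zs) and all(
        s * s % x in (z % x, y * z % x) for z, s in zip(zs, roots))
```

With the toy modulus x = 7 * 11 and the non residue y = 6, `verify(77, 6, zs, prove(7, 11, 6, zs))` accepts for reference elements such as `zs = [4, 6, 9, 10]`. When y is a residue and some z_j is a non residue, no value s can pass the check for that position, which is the source of the $2^{-k}$ soundness bound.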

Proving many statements. We modify the above described proof system
in such a way that the following two properties are satisfied: 1) the prover can
generate exactly one proof for each input and each reference string; 2) each proof
has the same distribution as the reference string. We use the following definition.
Let x be a Blum integer; for $z \in Z_x^{+1}$ and $b \in \{0, 1\}$, define $u = sqrt(x, z, b)$ as
the integer $u \in Z_x^{+1}$ such that (a) $u^2 = z \bmod x$ and (b) if $b = 0$ then $u < x/2$
else $u > x/2$. Now we give a formal description of our proof system (A,B).


Input to A and B:

• A $k(n + 1)$-bit reference string $\sigma = z_1 \circ \cdots \circ z_k \circ b_1 \circ \cdots \circ b_k$, where $z_j \in Z_x^{+1}$,
$b_j \in \{0, 1\}$, for $j = 1, \ldots, k$.

• An $(m + 1)$-tuple $(x, y_1, \ldots, y_m)$, where $|x| = n$, $y_i \in Z_x^{+1}$, for $i = 1, \ldots, m$.

Input to A: x's factorization.

Instructions for A.

A.1 Set $u_{1,j} = z_j$, $b_{1,j} = b_j$, for $j = 1, \ldots, k$.

A.2 For $i = 1, \ldots, m$,

for $j = 1, \ldots, k$,
if $u_{i,j} \in QR_x$ then
compute $u_{i+1,j} = sqrt(x, u_{i,j}, b_{i,j})$ and set $b_{i+1,j} = 0$;
if $u_{i,j} \in NQR_x$ then
compute $u_{i+1,j} = sqrt(x, y_i \cdot u_{i,j} \bmod x, b_{i,j})$ and set $b_{i+1,j} = 1$;

set $Proof_i = (u_{i+1,1}, \ldots, u_{i+1,k}, b_{i+1,1}, \ldots, b_{i+1,k})$.

A.3 Send $(Proof_1, \ldots, Proof_m)$ to B.


Input to B: A sequence of proofs $(Proof_1, \ldots, Proof_m)$, where $Proof_i =
(u_{i+1,1}, \ldots, u_{i+1,k}, b_{i+1,1}, \ldots, b_{i+1,k})$, $u_{i+1,j} \in Z_x^{+1}$, $b_{i+1,j} \in \{0, 1\}$, for $j = 1, \ldots, k$.

Instructions for B.

B.1 Set $u_{1,j} = z_j$, $b_{1,j} = b_j$, for $j = 1, \ldots, k$.

B.2 For $i = 1, \ldots, m$, and $j = 1, \ldots, k$,

verify that $u_{i+1,j}^2 = y_i^{b_{i+1,j}} \cdot u_{i,j} \bmod x$.

B.3 If all verifications are satisfied then output: ACCEPT else output: REJECT.
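
The chaining in steps A.1 to A.3 and B.1 to B.3 can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the authors' implementation: the names `sqrt_half`, `prove`, and `verify` are hypothetical, the modulus is tiny, and all inputs are assumed to lie in $Z_x^{+1}$. The verifier below also checks that each root lies in the half of the range selected by the bit $b_{i,j}$; that check is an assumption added here to reflect property 1) above, and membership of the roots in $Z_x^{+1}$ is not checked.

```python
def is_qr_prime(a, p):
    """Euler's criterion for an odd prime p (a not divisible by p)."""
    return pow(a % p, (p - 1) // 2, p) == 1

def is_qr(a, p, q):
    return is_qr_prime(a, p) and is_qr_prime(a, q)

def sqrt_half(p, q, z, b):
    """sqrt(x, z, b): the root of z with Jacobi symbol +1 in the half chosen by b."""
    x = p * q
    rp = pow(z % p, (p + 1) // 4, p)   # principal root mod p (p = 3 mod 4)
    rq = pow(z % q, (q + 1) // 4, q)   # principal root mod q (q = 3 mod 4)
    u = (rp * q * pow(q, -1, p) + rq * p * pow(p, -1, q)) % x
    small, large = min(u, x - u), max(u, x - u)   # both have Jacobi symbol +1
    return small if b == 0 else large

def prove(p, q, ys, zs, bits):
    """Instructions for A: each proof becomes the reference string of the next."""
    x = p * q
    u, b, proofs = list(zs), list(bits), []
    for y in ys:
        nu, nb = [], []
        for uj, bj in zip(u, b):
            if is_qr(uj, p, q):
                nu.append(sqrt_half(p, q, uj, bj)); nb.append(0)
            else:
                nu.append(sqrt_half(p, q, y * uj % x, bj)); nb.append(1)
        proofs.append((nu, nb))
        u, b = nu, nb
    return proofs

def verify(x, ys, zs, bits, proofs):
    """Instructions for B, plus the assumed half-range check on every root."""
    if len(proofs) != len(ys):
        return False
    u, b = list(zs), list(bits)
    for y, (nu, nb) in zip(ys, proofs):
        for uj, bj, nuj, nbj in zip(u, b, nu, nb):
            if nuj * nuj % x != pow(y, nbj, x) * uj % x:
                return False
            if (nuj > x // 2) != (bj == 1):
                return False
        u, b = nu, nb
    return True
```

For example, with x = 7 * 11, non residues `ys = [6, 10, 13]`, and reference string `zs = [4, 6, 9, 10]`, `bits = [0, 1, 1, 0]`, the three chained proofs verify against the same fixed reference string.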


Completeness, Soundness and Perfect Zero Knowledge: intuition. The
completeness property is not hard to check. To prove soundness and perfect
zero-knowledge, the following characterization of the distribution of a proof for
a quadratic non residue is useful. The i-th proof $Proof_i$ is a string of k
integers $u_{i+1,j}$ in $Z_x^{+1}$ and k bits $b_{i+1,j}$, such that each $u_{i+1,j}$ is uniformly
distributed (and so is its quadratic residuosity) and each bit $b_{i+1,j}$ is also
uniformly distributed. The soundness of (A,B) can then be proved by induction on
the number m of integers $y_i$. The base case is simple; for the inductive case, we
assume that $y_1, \ldots, y_{i-1}$ are quadratic non residues modulo x, and that $y_i$ is a
quadratic residue, and use the above characterization of the distribution for the
proof for $y_{i-1}$, that is also the reference string to be used for proving $y_i$. The
perfect zero-knowledge of (A,B) can be proved by generating the m proofs
starting from the last one, using the above characterization. Here the main difficulty
consists in simulating the generation of a square root $u_{i+1,j}$ of $y_i^{b_{i+1,j}} \cdot u_{i,j} \bmod x$
which is less than $x/2$ or not, according to the value of the random bit $b_{i,j}$ taken
from the reference string. The generation is accomplished as follows. The
simulator will first choose bit $b_{i+1,j}$ at random and $u_{i+1,j} \in Z_x^{+1}$ and then compute
$u_{i,j}$ such that $u_{i+1,j}^2 = y_i^{b_{i+1,j}} \cdot u_{i,j} \bmod x$; now, the value of bit $b_{i,j}$ is then
determined depending on whether $u_{i+1,j}$ is greater or smaller than $x/2$. It is possible
to see that if x is a Blum integer and y is a quadratic non residue, then bit $b_{i,j}$
(or in other words, the predicate saying whether $u_{i+1,j} < x/2$ or not) is uniformly
distributed, no matter how quadratic residues are distributed in $Z_x^{+1}$. We obtain
the following

Theorem 4. (A,B) is a non-interactive perfect zero-knowledge proof system with
security parameter k that can prove any polynomial number of quadratic non
residuosity statements, each of size n, and uses a reference string of length $\Theta(kn)$.


3.2  Expected  vs.  strict  polynomial  time  simulators 

The zero-knowledge requirement in the definition of a non-interactive
zero-knowledge proof system requires the simulator associated to the proof system
to run in expected polynomial time. We can transform any non-interactive
zero-knowledge proof system into one having the additional property that the
simulator runs in strict polynomial time. The transformation preserves the kind of
zero-knowledge, i.e., computational or perfect. We obtain the following

Theorem 5. Let L be a language having a non-interactive zero-knowledge proof
system. Then L has a non-interactive zero-knowledge proof system such that the
simulator associated runs in strict polynomial time.

3.3  A  general  class  of  languages 

In this subsection we show a non-interactive perfect zero-knowledge proof system
for proving many statements on a fixed reference string, which applies to some
general class of languages, not necessarily depending on number-theoretic
properties. We start with an informal discussion, and then define a class of languages
and give a protocol for all languages in such class.

An informal discussion. Generalizing the proof system of the previous section, an
idea to construct a randomness-efficient protocol for proving many statements in
non-interactive perfect zero-knowledge would be the following: a first statement
$x_1$ is proved on a given reference string $\sigma_1$ and then the proof itself is used in
order to compute a new reference string for the next statement $x_2$, and so on.
Specifically, instead of using the proof, whose structure is not known in general,
we would like to use the randomness needed by the simulator to simulate a
proof for $x_1$ in order to compute a new reference string for the next statement $x_2$.
Notice that because of Theorem 5, we can assume that the amount of randomness
needed by the simulator to simulate a proof is a fixed and well defined quantity.

Simulator-Rankable languages. Let L be a language and let (A,B) be a
non-interactive perfect zero-knowledge proof system for L; also, denote by M the
simulator associated to (A,B), by $\sigma$ the reference string, by x the common input,
and by $S_{M,\sigma,x}$ the set $\{R \mid M(R, x) = (\sigma, Proof)\}$. If $|x| = n$, let $|R| = r(n)$,
$|\sigma| = s(n)$ and $|S_{M,\sigma,x}| = 2^{r(n)-s(n)}$ (we can assume a fixed length $r(n)$ for string R
because of Theorem 5). We say that (A,B) is simulator-rankable if there exists a
polynomial-time computable function $F : \{0,1\}^n \times \{0,1\}^{r(n)} \rightarrow \{0,1\}^{r(n)-s(n)}$ such
that if $x \in L$ then $F(x, R)$ is the rank of R in set $S_{M,\sigma,x}$, where $\sigma$ is such that
$M(R, x) = (\sigma, Proof)$. We say that language L is simulator-rankable if there
exists a non-interactive perfect zero-knowledge proof system (A,B) for L which
is simulator-rankable.

A protocol for any simulator-rankable language. Let L be a
simulator-rankable language; now we describe a non-interactive perfect zero-knowledge
proof system (P,V) for proving any polynomial number $m = m(n)$ of membership
statements of size n to L which uses a fixed reference string. By $rank_S(x)$ we
denote the rank of element x in set S. Now we give a formal description of (P,V).


Input to P and V: n-bit strings $x_1, \ldots, x_m$, and an $r(n)$-bit string $\sigma$.

Instructions for P:

P.1 Set $\tau_1 = \sigma$.

P.2 For $i = 1, \ldots, m$,

write $\tau_i = \gamma_i \circ ind_i$, where $|\gamma_i| = s(n)$ and $|ind_i| = r(n) - s(n)$;
compute $R_i \in S_{M,\gamma_i,x_i}$ such that $rank_{S_{M,\gamma_i,x_i}}(R_i) = ind_i$;

set $\tau_{i+1} = R_i$.

P.3 Send $(\tau_1, \ldots, \tau_{m+1})$ to V.


Input to V: a sequence of $r(n)$-bit strings $(\tau_1, \ldots, \tau_{m+1})$.

Instructions for V:

V.1 Set $\tau_1 = \sigma$.

V.2 For $i = m, \ldots, 1$,

write $\tau_i = \gamma_i \circ ind_i$, where $|\gamma_i| = s(n)$ and $|ind_i| = r(n) - s(n)$;
set $R_i = \tau_{i+1}$ and $(\sigma_i, Proof_i) = M(R_i, x_i)$;
check that $\sigma_i = \gamma_i$ and $F(x_i, R_i) = ind_i$.

V.3 If all verifications are successful then output: ACCEPT and halt, else output:
REJECT and halt.
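
The data flow of (P,V) can be traced with a deliberately artificial Python sketch. Everything in it is an assumption made for illustration: `toy_simulator` is a stand-in for M that proves nothing and has no zero-knowledge property, the sizes r(n) = 8 and s(n) = 5 are arbitrary, and the prover inverts the toy simulator directly, whereas a real prover would rely on a witness. The sketch only shows how each string $\tau_i$ is split into $\gamma_i \circ ind_i$ and how $\tau_{i+1} = R_i$ is chained.

```python
R_BITS, S_BITS = 8, 5            # toy values for r(n) and s(n)
IND_BITS = R_BITS - S_BITS       # length of the rank index ind_i

def toy_simulator(R, x):
    """Stand-in for M(R, x): returns (reference string, proof)."""
    sigma = (R >> IND_BITS) ^ (x % (1 << S_BITS))
    return sigma, R              # the "proof" is just R in this toy

def rank_F(x, R):
    """F(x, R): rank of R among the 2^(r-s) strings mapped to the same sigma."""
    return R % (1 << IND_BITS)

def split(tau):
    """Write tau as gamma o ind with |gamma| = s(n) and |ind| = r(n) - s(n)."""
    return tau >> IND_BITS, tau % (1 << IND_BITS)

def prove(xs, sigma):
    """Instructions for P: tau_1 = sigma, tau_{i+1} = R_i."""
    taus = [sigma]
    for x in xs:
        gamma, ind = split(taus[-1])
        # The unique R with M(R, x) = (gamma, .) and rank ind, for this toy M.
        R = ((gamma ^ (x % (1 << S_BITS))) << IND_BITS) | ind
        taus.append(R)
    return taus

def verify(xs, sigma, taus):
    """Instructions for V: re-run the simulator on each R_i, from i = m down to 1."""
    if len(taus) != len(xs) + 1 or taus[0] != sigma:
        return False
    for i in range(len(xs) - 1, -1, -1):
        gamma, ind = split(taus[i])
        R = taus[i + 1]
        sigma_i, _proof = toy_simulator(R, xs[i])
        if sigma_i != gamma or rank_F(xs[i], R) != ind:
            return False
    return True
```

In this toy, every string sent after the first has the same length r(n) as the fixed reference string, independently of the number m of statements, which is the property Theorem 6 states for the real protocol.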

We obtain the following

Theorem 6. Let L be a simulator-rankable language and let (A,B) be a
simulator-rankable non-interactive perfect zero-knowledge proof system for L. Then (P,V)
is a non-interactive perfect zero-knowledge proof system that can prove any
polynomial number $m = m(n)$ of membership statements, each of size n, and uses a
reference string of length $r(n)$ (that is, not depending on m), where $r(n)$ is the
length of the random string used by the simulator M associated to (A,B).

Examples of simulator-rankable languages. A first example of a
simulator-rankable language is the language of quadratic non residuosity modulo Blum
integers. This can be seen by using the protocol in [5], reviewed in Section 3.1: for
each reference string there exist exactly $2^k$ random strings R in set $S_{M,\sigma,x}$,
since each integer $z_i \in Z_x^{+1}$ might have been generated from two different square
roots: $r_i$ and $-r_i \bmod x$. This allows one to compute the rank of any random string
in $S_{M,\sigma,x}$, for any reference string $\sigma$. Later, in Section 4.2 we show that the
language  of  all  l-out-of-3  thresholds  over  quadratic  non  residuosity  is  simulator- 
rankable.  Using  this  fact,  we  can  show  the  same  for  the  language  of  ^-out-of-m 
thresholds  over  quadratic  non  residuosity  [11]  and  for  the  language  of  all  secret- 
sharing  based  compositions  over  quadratic  non  residuosity  [10].  Also,  it  is  easy 
to  see  that  the  language  of  all  elements  in  a  family  of  trapdoor  permutations 
[4] is simulator-rankable. This implies that all known languages having a
non-interactive perfect zero-knowledge proof system are simulator-rankable.

4  A  randomness-efficient  protocol  for  NP 

We start by reviewing the non-interactive zero-knowledge proof system for the
NP-complete language 3SAT given in [5]. We will denote by k the security
parameter of the proof system, by n the number of variables and by m the number
of clauses of the 3SAT input formula $\phi$. Also, we choose the size of the Blum
integer used as a modulus to be equal to k.

The  protocol  in  [5]  for  3SAT.  The  non-interactive  zero-knowledge  proof 
system  for  3SAT  given  in  [5]  uses  a  reference  string  of  length  &[kn^)  and  can  be 
divided  into  three  steps. 

1. Committing to truth values. First of all the prover uniformly chooses a
k-bit Blum integer x and a quadratic non residue y. Then, using x, y, and
a satisfying assignment t for variables in $\phi$, the prover assigns an integer
$y_i \in Z_x^{+1}$ to each literal $l_i$ in $\phi$ in such a way that if y is a quadratic non
residue modulo x, then the following is true: $y_i$ is a quadratic non residue
modulo x if and only if literal $l_i$ is true under the assignment t.

2. Proving that the commitments are consistent. Here the prover sends a
non-interactive zero-knowledge proof that x is a Blum integer and y is a quadratic
non residue modulo x.

3. Proving that clauses are satisfied. For each clause $(l_{i1} \lor l_{i2} \lor l_{i3})$ of $\phi$, the
prover proves that at least one of $y_{i1}, y_{i2}, y_{i3}$ is a quadratic non residue
modulo x, where integer $y_{ij}$ was assigned to literal $l_{ij}$.

Our contribution. We give a significantly different implementation of the first
and third step in the above protocol, and obtain the following

Theorem 7. Under the quadratic residuosity assumption, there exists a
non-interactive computational zero-knowledge proof system with security parameter
k for 3SAT, using a reference string of length $\Theta(nk)$, where n is the number of
variables of the input formula.

Now  we  informally  describe  our  implementation  of  the  first  and  third  steps  of 
the  above  protocol,  omitting  a  formal  description.  We  remark  that  our  protocol 
satisfies  also  the  requirement  of  strong  soundness,  that  is,  it  is  sound  also  if  a 
malicious  prover  chooses  the  statement  after  seeing  the  reference  string. 


4.1  Committing  to  the  truth  values  of  the  literals 

Let t be an assignment for variables $v_1, \ldots, v_n$ in the 3SAT formula $\phi$; let x be the
input modulus and let $w_1 \circ \cdots \circ w_n$ be a portion of the random string, where each
$w_i \in Z_x^{+1}$. Also, denote by $q_i$ the quadratic residuosity of $w_i$, for each $i = 1, \ldots, n$.

Then the prover P commits to each $v_i$ and $\bar{v}_i$ as follows. For each $i = 1, \ldots, n$,
P sets $d_i = t(v_i) \oplus q_i$, $tcom_i = y^{d_i} \cdot w_i \bmod x$ and $ncom_i = y \cdot tcom_i \bmod x$. The
commitments are then $(v_i, tcom_i)$, $(\bar{v}_i, ncom_i)$, for $i = 1, \ldots, n$. It is easy to check
that $tcom_i$ ($ncom_i$) is a quadratic non residue if and only if variable $v_i$ ($\bar{v}_i$) is
true under assignment t. We remark that the above commitments are generated
using integers from the reference string, while in [5] they were generated by
the prover using some private randomness. In our analysis, this will decrease
significantly the cheating power of a dishonest prover and will allow us to use
a shorter reference string in the proof system for proving that the clauses have
been correctly constructed.
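
The commitment rule can be checked numerically with a short Python sketch. It is an illustration under assumptions: the function names are hypothetical, the modulus x = 7 * 11 is a toy Blum integer whose factors the prover knows, and y and the $w_i$ are assumed to be in $Z_x^{+1}$ with y a quadratic non residue.

```python
def is_qr_prime(a, p):
    """Euler's criterion for an odd prime p (a not divisible by p)."""
    return pow(a % p, (p - 1) // 2, p) == 1

def is_nqr(a, p, q):
    """True iff a is a quadratic non residue modulo x = p*q."""
    return not (is_qr_prime(a, p) and is_qr_prime(a, q))

def commit(p, q, y, ws, t):
    """Return the pairs (tcom_i, ncom_i) for truth values t[i] in {0, 1}."""
    x = p * q
    coms = []
    for w, tv in zip(ws, t):
        q_i = 1 if is_nqr(w, p, q) else 0   # quadratic residuosity of w_i
        d_i = tv ^ q_i                      # d_i = t(v_i) XOR q_i
        tcom = pow(y, d_i, x) * w % x       # tcom_i = y^{d_i} * w_i mod x
        ncom = y * tcom % x                 # ncom_i = y * tcom_i mod x
        coms.append((tcom, ncom))
    return coms
```

For instance, `commit(7, 11, 6, [4, 6, 9, 10], [1, 0, 0, 1])` yields pairs in which `tcom` is a non residue exactly for the variables set to true and `ncom` is a non residue exactly for the variables set to false.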


4.2  Proving  that  the  clauses  are  satisfied 

In order to prove that a single clause is satisfied, we use a non-interactive perfect
zero-knowledge proof system for the language 3-OR(NQR$_x$) of triples $(y_1, y_2, y_3)$
such that at least one out of $y_1, y_2, y_3$ is a quadratic non residue modulo the Blum
integer x. We do not yet know whether such language is simulator-rankable, since
it is not clear how to use the two protocols given in [14, 11] for this language
in order to derive such property. Here we describe a non-interactive perfect
zero-knowledge proof system (A,B) for language 3-OR(NQR$_x$), which allows us to
conclude that such language is simulator-rankable, and thus allows us to prove all
m clauses of formula $\phi$ on one fixed random string.

An informal description. We start with some definitions. Let x be a Blum
integer and $b_1, b_2, b_3 \in \{0, 1\}$; we say that a triple $(z_1, z_2, z_3)$ of integers in
$Z_x^{+1}$ has quadratic character $(b_1, b_2, b_3)$ if $Q_x(z_i) = b_i$, for $i = 1, 2, 3$. Also,
we say that two triples $(y_1, y_2, y_3)$ and $(z_1, z_2, z_3)$ of integers in $Z_x^{+1}$ have
different quadratic characters if the two triples of bits representing the quadratic
characters of $(y_1, y_2, y_3)$ and $(z_1, z_2, z_3)$ are different. Finally, we define the
OR-triples of $(y_1, y_2, y_3)$, for any triple $(y_1, y_2, y_3)$ of integers in $Z_x^{+1}$, as the 7 triples
$(y_1, y_2, y_3)$, $(y_1 y_2 y_3, y_1 y_3, y_1)$, $(y_2, y_3, y_1 y_2)$, $(y_3, y_1 y_2, y_2 y_3)$, $(y_1 y_2, y_2 y_3, y_1 y_2 y_3)$,
$(y_2 y_3, y_1 y_2 y_3, y_1 y_3)$, $(y_1 y_3, y_1, y_2)$, where all computations are done modulo x.
We will use the following

Fact 1 Let x be a Blum integer, and let $y_1, y_2, y_3 \in Z_x^{+1}$. The OR-triples
of $(y_1, y_2, y_3)$ satisfy the following properties:

1. If $(y_1, y_2, y_3)$ has quadratic character (0,0,0) then all OR-triples of $(y_1, y_2, y_3)$
have quadratic character (0,0,0);

2. If $(y_1, y_2, y_3)$ has quadratic character different from (0,0,0) then

- each OR-triple of $(y_1, y_2, y_3)$ has quadratic character different from (0,0,0);

- each two OR-triples of $(y_1, y_2, y_3)$ have different quadratic character.
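
Fact 1 can be checked exhaustively at the level of quadratic characters. The Python sketch below is an illustration, with hypothetical function names; it relies on the observation that, for elements of $Z_x^{+1}$, the quadratic character of a product is the XOR of the characters of the factors (0 for a residue, 1 for a non residue).

```python
from itertools import product

def or_triple_characters(a, b, c):
    """Characters of the 7 OR-triples of a triple with character (a, b, c)."""
    return [
        (a, b, c),                      # (y1, y2, y3)
        (a ^ b ^ c, a ^ c, a),          # (y1 y2 y3, y1 y3, y1)
        (b, c, a ^ b),                  # (y2, y3, y1 y2)
        (c, a ^ b, b ^ c),              # (y3, y1 y2, y2 y3)
        (a ^ b, b ^ c, a ^ b ^ c),      # (y1 y2, y2 y3, y1 y2 y3)
        (b ^ c, a ^ b ^ c, a ^ c),      # (y2 y3, y1 y2 y3, y1 y3)
        (a ^ c, a, b),                  # (y1 y3, y1, y2)
    ]

def fact1_holds():
    """Check both properties of Fact 1 for all 8 possible characters."""
    zero = (0, 0, 0)
    for ch in product((0, 1), repeat=3):
        chars = or_triple_characters(*ch)
        if ch == zero:
            if any(t != zero for t in chars):
                return False
        elif zero in chars or len(set(chars)) != 7:
            return False
    return True
```

When the input character is non-zero, the 7 OR-triples therefore cover each of the 7 non-zero characters exactly once, which is what lets the prover match any non-zero character of a reference-string triple.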

The proof system (A,B) uses a reference string of length $\Theta(nk)$, viewed as the
concatenation of triples $(z_{i,1}, z_{i,2}, z_{i,3})$ of integers in $Z_x^{+1}$, indexed by i. On
input $(x, y_1, y_2, y_3)$, the prover A computes the quadratic character $(d_{i,1}, d_{i,2}, d_{i,3})$
of each triple $(z_{i,1}, z_{i,2}, z_{i,3})$. Now, if $(d_{i,1}, d_{i,2}, d_{i,3}) = (0,0,0)$ then A computes
and sends to B square roots of $z_{i,1}, z_{i,2}, z_{i,3}$. Instead, if $(d_{i,1}, d_{i,2}, d_{i,3}) \neq (0,0,0)$,
A computes and sends to B square roots of $z_{i,1} \cdot v_1 \bmod x$, $z_{i,2} \cdot v_2 \bmod x$,
and $z_{i,3} \cdot v_3 \bmod x$, where $(v_1, v_2, v_3)$ is the OR-triple with quadratic character
$(d_{i,1}, d_{i,2}, d_{i,3})$. The verifier B checks that the square roots are correctly
computed. A formal description of (A,B) is omitted. Similarly as done for the
language of quadratic non residuosity, we can show that the language 3-OR(NQR$_x$)
is simulator-rankable. Using Theorem 6, we obtain the following

Theorem 8. There exists a non-interactive perfect zero-knowledge proof system
with security parameter k for proving any polynomial number $m(n)$ of
membership statements for the language 3-OR(NQR$_x$) of size n, which uses a reference
string of length $\Theta(kn)$.

Abstract. In this paper, we study the optimum cost chromatic partition
(OCCP) problem for several graph classes. The OCCP problem is the
problem of coloring the vertices of a graph such that adjacent vertices
get different colors and that the total coloring costs are minimum.

We prove that there exists no polynomial approximation algorithm with
ratio $O(|V|^{0.5-\epsilon})$ for the OCCP problem restricted to bipartite and
interval graphs, unless P = NP.

Furthermore, we propose approximation algorithms with ratio $O(|V|^{0.5})$
for bipartite, interval and unimodular graphs. Finally, we prove that
there exists no polynomial approximation algorithm with ratio $O(|V|^{1-\epsilon})$
for the OCCP problem restricted to split, chordal, permutation and
comparability graphs, unless P = NP.


1  Introduction 

In this paper, we study the optimum cost chromatic partition (OCCP) problem
for several graph classes. The graph classes used in this paper are defined e.g. in
[5]. The OCCP problem can be described as follows: Given a graph $G = (V, E)$
with n vertices and a sequence of coloring costs $(k_1, \ldots, k_n)$, find a feasible
coloring $f(v)$ for each vertex $v \in V$ such that the total coloring costs $\sum_{v \in V} k_{f(v)}$
are minimum. A coloring $f : V \rightarrow \{1, \ldots, n\}$ is feasible if adjacent vertices
have different colors. Alternatively, the OCCP problem can be formulated as
follows: Given a graph $G = (V, E)$ with n vertices and a sequence of coloring
costs $(k_1, \ldots, k_n)$, find a partition into independent sets $U_1, \ldots, U_s$ such that
$\sum_{c=1}^{s} k_c \cdot |U_c|$ is minimum. We may assume that $k_c \leq k_d$ whenever $c < d$.
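
To make the objective concrete, the following Python sketch solves OCCP by exhaustive search on very small graphs. It is only an illustration of the problem definition: the function name `occp_brute_force` and the 0-based vertex and color indices are assumptions made here, and the search takes time exponential in n, so it says nothing about the approximation algorithms studied in this paper.

```python
from itertools import product

def occp_brute_force(n, edges, costs):
    """Return (minimum total cost, coloring) over all feasible colorings.

    Vertices and colors are numbered 0..n-1; costs[c] is the cost of color c.
    """
    best_cost, best_coloring = None, None
    for f in product(range(n), repeat=n):
        # Feasible: adjacent vertices must receive different colors.
        if any(f[u] == f[v] for u, v in edges):
            continue
        cost = sum(costs[c] for c in f)
        if best_cost is None or cost < best_cost:
            best_cost, best_coloring = cost, f
    return best_cost, best_coloring
```

On the path with edges (0,1), (1,2) and costs (1, 2, 3), the optimum gives the cheap color to both endpoints and has total cost 4. On a star with three leaves and costs (1, 10, 10, 10), the optimum gives the cheap color to the leaves and has total cost 13, so the cheapest color does not go to the center.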

A VLSI layout problem introduced by Supowit [11] with terminals on a circle
or on two opposite parallel lines corresponds to the OCCP problem restricted
to circle or permutation graphs. Another application is given by Kroon et al.
[9]. The OCCP problem for interval graphs is equivalent to the Fixed Interval
Scheduling Problem (FISP) with machine dependent processing costs. It is not
difficult to see that the OCCP problem is NP-complete for arbitrary graphs. Sen
et al. [10] proved that the OCCP problem for circle graphs is NP-complete.

Kroon  et  al.  [9]  studied  the  OCCP  problem  for  interval  graphs  and  trees.  They 
showed  that  the  problem  restricted  to  trees  can  be  solved  in  linear  time  and  that 
the  problem  restricted  to  interval  graphs  is  NP-complete  even  if  there  are  only 
four  different  values  for  the  coloring  costs.  If  there  are  only  two  different  values 
for the coloring costs, then the OCCP problem is equivalent to the maximum
q-colorable subgraph problem. Suppose that the first $q$ costs are equal and that
the last $n - q$ costs are equal ($k_1 = \ldots = k_q < k_{q+1} = \ldots = k_n$). Then, we get
an optimum solution if the maximum q-colorable subgraph is colored with the
colors $1, \ldots, q$ and if the other vertices are colored with the remaining colors.
The maximum q-colorable subgraph problem has been studied extensively by
Frank [3], Gavril [4], Yannakakis and Gavril [12], Jansen et al. [6] and Chang et
al. [2]. Further complexity results for the OCCP problem can be found in [7].

We give several approximation results for the OCCP problem restricted to
bipartite, chordal, comparability, interval, permutation, unimodular and split
graphs. We prove that there exists no polynomial approximation algorithm with
ratio $O(|V|^{0.5-\varepsilon})$ for the OCCP problem restricted to bipartite and interval
graphs, unless P = NP. Furthermore, we propose approximation algorithms
with ratio $O(|V|^{0.5})$ for both graph classes and for unimodular graphs. Finally,
we prove that there exists no polynomial approximation algorithm with ratio
$O(|V|^{1-\varepsilon})$ for the OCCP problem restricted to split, chordal, permutation and
comparability graphs, unless P = NP.


2  Bipartite  graphs 

In this section we prove that OCCP is hard to approximate with ratio
$O(|V|^{0.5-\varepsilon})$ for bipartite graphs. After that, we propose an approximation
algorithm with ratio $O(|V|^{0.5})$.


2.1  Non-Approximability  result 

We use the precoloring extension problem, which was proved to be NP-complete
for bipartite graphs by Bodlaender, Jansen and Woeginger [1]. Given a bipartite graph
$G = (V, E)$ with vertex set $V = A \cup B$, edge set $E \subseteq \{\{v, w\} \mid v \in A, w \in B\}$ and
three specified vertices $a_1, a_2, a_3 \in A$, the 1-PrExt problem is to decide whether
there exists a 3-coloring $f$ of $G$ with $f(a_1) = 1$, $f(a_2) = 2$ and $f(a_3) = 3$.

First, we show the NP-completeness of the OCCP problem using an integer
parameter $K$. Later, we specify the parameter $K$ to achieve our
non-approximability result.

Theorem  1.  The  OCCP  problem  for  bipartite  graphs  is  NP-complete  if  there 
are  at  least  four  different  cost  values. 

Proof. The theorem is proved by a reduction from 1-PrExt restricted to bipartite
graphs. We may assume that $G = (A \cup B, E)$ contains three further vertices
$b_1, b_2, b_3 \in B$ with $\{a_i, b_j\} \in E$ for $1 \le i \ne j \le 3$. Let $n$ be the number of
vertices in $G$.

Let $I$ be an instance of 1-PrExt containing the bipartite graph $G = (A \cup B, E)$
with $a_1, a_2, a_3 \in A$ and $b_1, b_2, b_3 \in B$ as described above. Let $K$ be a positive
integer with $K > 1$. An instance $I'$ of the OCCP problem is constructed as
follows. First, we define a bipartite graph $G' = (V', E')$ with vertex set

$V' = \{v_{1,j}, v_{2,j} \mid 1 \le j \le 2000K^2 n\} \cup$
$\{v_{3,j'}, v_{4,j'} \mid 1 \le j' \le 100Kn\} \cup$
$\{v_5, v_6\}$

and edge set

$E' = \{[\ldots] \mid 1 \le j \le 2000K^2 n,\ 1 \le j' \le 100Kn\} \cup$
$\{[\ldots] \mid 1 \le j' \le 100Kn\} \cup$
$\{\{v_5, v_6\}\}.$


Fig. 1. The constructed graph G' (vertex groups $v_{1,j}$, $v_{3,j'}$, $v_5$, $v_6$, $v_{4,j'}$, $v_{2,j}$) and a feasible 2-coloring of G'.


The bipartite graph $G'$ illustrated in Figure 1 contains $4000K^2 n + 200Kn + 2$
vertices. Then, we connect $G$ and $G'$ using the following edges:

E  =  {bi,V4j'},{bi,V5},  {a2,  ^^2,;}, 

{{^2,  j},  {62,  r^5},  {as,  '^3,;'},  {as,  '^2,^}, 

{63,  vi,j],  {^3,  I  1  <  i  <  2000A2^,  1  <  /  <  lOOAn}}. 

In  total,  the  bipartite  graph  G  for  I'  is  given  by 

G  =  {AUBUV\EUE'UE). 

The cost values are $k_1 = 1$, $k_2 = 10K$, $k_3 = 100K^2$ and $k_4 = 15000K^3 n$. A
cheap coloring of G has to use only three colors; otherwise the costs would be
more than $15000K^3 n$.

We can prove the following statement: $I$ is a yes instance of 1-PrExt if
and only if the minimum total costs of coloring all vertices in G do not exceed
$6100K^2 n + 100K^2 + 1$.

If there is no solution of the 1-PrExt problem, then we have either four colors
in G with coloring costs of at least $k_4 = 15000K^3 n$ or a 3-coloring with coloring
costs of at least $10000K^3 n$. □

Theorem 2. For each $\varepsilon$ with $0 < \varepsilon < \frac{1}{2}$, there exists no polynomial approximation
algorithm with ratio $O(|V|^{0.5-\varepsilon})$ for the OCCP problem restricted to bipartite graphs,
unless P = NP.
Proof. Let $H$ be an approximation algorithm for the OCCP problem that computes
a coloring with costs $H(I') \le c|V|^{0.5-\varepsilon}\,OPT(I')$, where $c$ is a constant and
$OPT(I')$ are the minimum costs of a solution of $I'$.

We construct for an instance $I$ of the 1-PrExt problem an instance $I'$ of the
OCCP problem as described in the proof above. We obtain a graph with at most
$4300K^2 n$ vertices. If there exists a solution of the 1-PrExt problem, the optimum
solution of the OCCP instance $I'$ has costs of at most $6200K^2 n$. In this case, our
approximation algorithm produces the value $H(I') \le 6200K^2 c n |V|^{0.5-\varepsilon}$. Since
the number of vertices in $I'$ is at most $4300K^2 n$, we have

$c|V|^{0.5-\varepsilon} \le (4300)^{0.5-\varepsilon} K^{1-2\varepsilon} c\, n^{0.5-\varepsilon}.$

If there exists no solution of the 1-PrExt instance, then $OPT(I') \ge 10000K^3 n$
and, therefore, algorithm $H$ generates a solution with costs of at least
$10000K^3 n$. Next, we consider the inequality

$10000K^3 n > (4300)^{0.5-\varepsilon}\, 6200\, K^{3-2\varepsilon} c\, n^{1.5-\varepsilon}.$

This inequality is satisfied if and only if

$K^{2\varepsilon} > \frac{c\,(4300)^{0.5-\varepsilon}\, 6200}{10000}\; n^{0.5-\varepsilon}.$

We define

$K = \left\lceil \left( \frac{c\,(4300)^{0.5-\varepsilon}\, 6200}{10000}\; n^{0.5-\varepsilon} \right)^{1/(2\varepsilon)} \right\rceil + 1.$

Since $c$ and $\varepsilon$ are constant, $K$ is polynomial in $n$ and, therefore, the instance $I'$
can be constructed in polynomial time. If there exists no solution of the 1-PrExt
problem, then $H$ generates a solution with costs of at least

$10000K^3 n > (4300)^{0.5-\varepsilon}\, 6200\, K^{3-2\varepsilon} c\, n^{1.5-\varepsilon} \ge 6200K^2 c n |V|^{0.5-\varepsilon}.$

Therefore, by using the polynomial time approximation algorithm $H$, we could
decide the existence of a solution for the 1-PrExt problem, which would imply
P = NP. □
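The choice of the parameter K can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, the constants 4300, 6200 and 10000 are the ones used in the proof, and floating point arithmetic is assumed to be accurate enough for the moderate inputs shown.

```python
import math

def choose_K(n, c, eps):
    """K = ceil(X ** (1 / (2 * eps))) + 1, where X is the right-hand side of the
    condition K^(2*eps) > X from the proof."""
    x = c * 4300 ** (0.5 - eps) * 6200 / 10000 * n ** (0.5 - eps)
    return math.ceil(x ** (1 / (2 * eps))) + 1

def gap_holds(n, c, eps):
    """Check 10000 K^3 n > 4300^(0.5-eps) * 6200 * K^(3-2eps) * c * n^(1.5-eps)."""
    K = choose_K(n, c, eps)
    lhs = 10000 * K ** 3 * n
    rhs = 4300 ** (0.5 - eps) * 6200 * K ** (3 - 2 * eps) * c * n ** (1.5 - eps)
    return lhs > rhs

# Hypothetical parameter values chosen only for illustration.
checks = [gap_holds(10, 2.0, 0.25), gap_holds(100, 1.0, 0.25)]
```

For fixed c and eps the value returned by `choose_K` grows like a fixed power of n, which is the sense in which K is polynomial in n.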


2.2  Approximability  result 

The  key  idea  of  the  approximation  algorithm  is  to  compute  two  colorings  for  the 
problem  and  to  choose  the  cheaper  one. 

Algorithm  A 

given: Instance $I$ of the OCCP problem containing a bipartite graph $G = (V, E)$
and cost vector $(k_1, \ldots, k_{|V|})$.

(1) Compute a 2-coloring of $G$ with $n_1$ vertices colored with color 1 and $|V| - n_1$
vertices colored with color 2 such that $n_1$ is maximum and, therefore, $n_1 \ge |V|/2$.
The costs of the first coloring are $A_1(I) = n_1 k_1 + (|V| - n_1) k_2$.
(2) Compute a maximum independent set $U$ in $G$ with $\alpha(G)$ vertices and color
the vertices in $U$ with color 1. Then, compute a 2-coloring of $G[V \setminus U]$
with $n_1'$ vertices colored with color 2 and $|V| - \alpha(G) - n_1'$ vertices colored
with color 3 such that $n_1' \ge (|V| - \alpha(G))/2$. The costs of the second coloring are

$A_2(I) = \alpha(G) k_1 + n_1' k_2 + (|V| - \alpha(G) - n_1') k_3.$

(3) Choose the cheaper coloring among the two colorings.

We note that the costs of the second coloring are bounded by

$\alpha(G) k_1 + \frac{|V| - \alpha(G)}{2}(k_2 + k_3) \le \alpha(G) k_1 + (|V| - \alpha(G)) k_3.$

Theorem 3. Algorithm A computes a solution of the OCCP problem restricted
to bipartite graphs with approximation ratio $\le |V|^{0.5}$.

Proof. Let $I$ be an instance of the OCCP problem containing a bipartite graph
$G = (V, E)$ and cost vector $(k_1, \ldots, k_{|V|})$. Then, we have two lower bounds for
the optimum value $OPT(I)$:

(1) $OPT(I) \ge |V| k_1$,

(2) $OPT(I) \ge \alpha(G) k_1 + (|V| - \alpha(G)) k_2$.

We consider the two cases $k_3 \le \sqrt{|V|}\,k_2$ and $k_3 > \sqrt{|V|}\,k_2$ and can prove that

$A(I) \le |V|^{0.5}\,OPT(I)$. □
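The three steps of Algorithm A can be sketched as follows. This is a minimal illustration, not the author's implementation: all function names are hypothetical, the input is assumed to be a bipartite graph with at least three cost values, and the maximum independent set is found by exhaustive search as a stand-in for the polynomial method available on bipartite graphs.

```python
from itertools import combinations

def build_adj(vertices, edges):
    """Adjacency sets of the subgraph induced by `vertices`."""
    vs = set(vertices)
    adj = {v: set() for v in vertices}
    for u, w in edges:
        if u in vs and w in vs:
            adj[u].add(w)
            adj[w].add(u)
    return adj

def two_color_larger_first(vertices, adj):
    """2-color each component of a bipartite graph; the larger side gets label 0."""
    label = {}
    for s in vertices:
        if s in label:
            continue
        side, stack = {s: 0}, [s]
        while stack:
            u = stack.pop()
            for w in adj[u]:
                if w not in side:
                    side[w] = 1 - side[u]
                    stack.append(w)
        zeros = sum(1 for x in side.values() if x == 0)
        flip = zeros < len(side) - zeros
        for v, x in side.items():
            label[v] = 1 - x if flip else x
    return label

def max_independent_set(vertices, edges):
    """Exhaustive search (exponential); a placeholder for a polynomial routine."""
    for size in range(len(vertices), -1, -1):
        for cand in combinations(vertices, size):
            s = set(cand)
            if not any(u in s and w in s for u, w in edges):
                return s
    return set()

def cost(coloring, k):
    return sum(k[c - 1] for c in coloring.values())

def algorithm_a(vertices, edges, k):
    # Step (1): 2-coloring with as many vertices as possible in color 1.
    lab = two_color_larger_first(vertices, build_adj(vertices, edges))
    first = {v: lab[v] + 1 for v in vertices}
    # Step (2): maximum independent set gets color 1, the rest colors 2 and 3.
    U = max_independent_set(vertices, edges)
    rest = [v for v in vertices if v not in U]
    lab2 = two_color_larger_first(rest, build_adj(rest, edges))
    second = {v: 1 for v in U}
    second.update({v: lab2[v] + 2 for v in rest})
    # Step (3): keep the cheaper of the two colorings.
    return min(first, second, key=lambda col: cost(col, k))

# Hypothetical instance: two adjacent centers 0 and 1 with three leaves each.
vertices = list(range(8))
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)]
k = [1, 2, 3, 4, 5, 6, 7, 8]
result = algorithm_a(vertices, edges, k)
```

On this instance the first coloring costs 12 and the second costs 11, so the three-color solution is returned.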

3  Interval  graphs 

In this section we prove that the OCCP problem restricted to interval graphs
is hard to approximate with ratio $O(|V|^{0.5-\varepsilon})$. Furthermore, we propose an
approximation algorithm with ratio $O(|V|^{0.5})$ for interval graphs and also for
unimodular graphs.


3.1  Non-Approximability  result 

The  NP-completeness  proof  uses  a  reduction  from  Numerical  Three  Dimensional 
Matching  (N3DM)  and  is  a  modification  of  the  pure  NP-completeness  proof  of 
the  OCCP  problem  given  by  Kroon  et  al.  [9]. 

Theorem 4. For each $\varepsilon$ with $0 < \varepsilon < \frac{1}{2}$, there exists no polynomial approximation
algorithm with ratio $O(|V|^{0.5-\varepsilon})$ for the OCCP problem restricted to interval graphs,
unless P = NP.

Proof. First, we give a reduction from N3DM with variable parameter $K \in \mathbb{N}$
and, later, we specify the parameter $K \in \mathbb{N}$ to achieve our non-approximability
result. Let $I_1$ be an instance of N3DM with integer $t$ and rational numbers
$0 < a_i, b_i, c_i < 1$ for $1 \le i \le t$ with $\sum_{i=1}^{t} (a_i + b_i + c_i) = t$. The N3DM problem
is to decide whether there exist permutations $\rho$ and $\delta$ of $\{1, \ldots, t\}$ such that
$a_i + b_{\rho(i)} + c_{\delta(i)} = 1$ for $1 \le i \le t$.

We choose further rational numbers $A_i$, $B_j$ and $X_{ij}$ such that all these numbers
are different and that $4 < A_i < 5 < B_j < 6$ and $7 < X_{ij} < 9$ for $1 \le i, j \le t$.

Next, we construct an instance $I_2$ of the OCCP problem. We use the intervals
given in Table 1 for the interval graph.


interval 

interval 

numbers 

(0.5,] 

(ll-ct,13] 

(1.2] 

(2,  A] 

t  times  or  1  <  i  <  f 

(Ai,X„] 

(Xtj,  10  + 

l<i,j  <t 

(X.i.l4] 

1  <  ^ 

(3.5,] 

(0,.4.] 

/  —  1  times  and  l<i<torl<j<t 

—  t  times 

t  times,  0  <  /  <  200dA^"? 

'.9nnnA'2f4  i 

'  2000  ft's t4  )  1  2000f<r2t4J 

(  r  T+i  ] 

(^A  i+1  14  . i .  1 

—  t  times,  0  <  /  <  lQ9Kt^ 

2000  r<'2^4  5  2000f<'2*4J 

Table 1. The intervals in the interval graph


Furthermore,  there  are  t  colors  with  costs  1,  /  colors  with  costs  10/lT^, 

colors  with  costs  100 and  all  other  colors  have  costs  20000FC^^®. 

The first claim (see also [8]) is to prove the following statement: $I_1$ is a yes
instance of N3DM if and only if the minimum total costs of coloring all intervals
of $I_2$ do not exceed

costs{K)  —  2000K^t^  -h  -h  2300/^:2^®  -f  50/^^  -  50/1 

If $I_1$ is a no instance of N3DM, then the total costs of coloring all intervals of $I_2$
are  greater  than  10000/C^t^.  We  notice  that  the  value  costs{K)  is  bounded  by 
4355/i2^®. 

The  second  part  of  the  proof  is  the  specification  of  the  parameter  K  to 
achieve  our  non-approximability  result.  We  define 

r  ^/(4208)°;°4355  1 

I  V  10000  ^ 

and  get  our  non-approximability  result  (see  also  [8]).  □ 


3.2  Approximability  result 

Next, we propose an approximation algorithm with ratio $O(|V|^{0.5})$ for the
OCCP problem restricted to interval graphs or to unimodular graphs. The key
idea is to analyse the structure of the optimum solution and to solve a special
coloring problem.

Suppose that the optimum solution consists of $b_{opt} \ge \chi(G)$ colors. Furthermore,
we assume that the colors $a_{opt}, a_{opt}+1, \ldots, b_{opt}$ cover at least $\lceil\sqrt{|V|}\rceil$
vertices and that the colors $a_{opt}+1, \ldots, b_{opt}$ cover less than $\lceil\sqrt{|V|}\rceil$ vertices of
$G$. This implies that $a_{opt} \in \{1, \ldots, b_{opt}\}$. Let $n_{a_{opt},\,b_{opt}-a_{opt}}$ be the number of
vertices colored with colors $1, \ldots, a_{opt}$ and let $\bar{n}_{a_{opt},\,b_{opt}-a_{opt}}$ be the number of
vertices colored with the other colors $a_{opt}+1, \ldots, b_{opt}$. Therefore, $\bar{n}_{a_{opt},\,b_{opt}-a_{opt}}$
is bounded by $\sqrt{|V|}$.

Using these assumptions, we obtain the following lower bounds for the
minimum costs $OPT(I)$ of a coloring:

(1) $OPT(I) \ge \lceil\sqrt{|V|}\rceil\, k_{a_{opt}}$,

(2) $OPT(I) \ge k_{b_{opt}}$.

The first inequality is satisfied, since at least $\lceil\sqrt{|V|}\rceil$ vertices are colored with the colors
$a_{opt}, a_{opt}+1, \ldots, b_{opt}$ and since $k_{a_{opt}} \le \ldots \le k_{b_{opt}}$. The second
inequality follows from the fact that color $b_{opt}$ occurs at least once in the optimum
coloring.

For our approximation algorithm we have to solve the following graph
theoretical problem (called maximum $(a, b - a)$-colorable subgraph problem).

Maximum $(a, b - a)$-colorable subgraph

Given: A graph $G = (V, E)$, and numbers $a, b \in \mathbb{N}$ with $a \le b$ and $b \ge \chi(G)$.
Question: Compute a partition $(V', V \setminus V')$ of $V$ such that $V'$ has maximum
cardinality and can be colored with $a$ colors and $V \setminus V'$ can be colored with
$b - a$ colors.


Let $H$ be an optimum algorithm to solve the maximum $(a, b - a)$-colorable
subgraph problem. A call of this algorithm with parameters $a$ and $b$ is denoted
by $H(a, b - a)$. Note that the maximum $(a, b - a)$-colorable subgraph problem
is at least as hard as the maximum q-colorable subgraph problem. This implies that the
decision problem corresponding to the maximum $(a, b - a)$-colorable subgraph
problem is NP-complete for e.g. split graphs, undirected path graphs and their
complements and for k-trees with unbounded k.

We  have  proved  the  following  results: 


Theorem 5. (1) The maximum $(a, b - a)$-colorable subgraph problem for interval
graphs is solvable in polynomial time using a mincost flow algorithm.
(2) The maximum $(a, b - a)$-colorable subgraph problem for unimodular graphs
is solvable in polynomial time using a linear program.


We denote by $\alpha_{a,b-a}(G)$ the maximum cardinality of such a subset $V'$ and
with $\bar{\alpha}_{a,b-a}(G)$ the number of vertices in $V \setminus V'$. Clearly,
$\alpha_{a_{opt},\,b_{opt}-a_{opt}}(G) \ge n_{a_{opt},\,b_{opt}-a_{opt}}$ and
$\bar{\alpha}_{a_{opt},\,b_{opt}-a_{opt}}(G) \le \bar{n}_{a_{opt},\,b_{opt}-a_{opt}}$. Given a solution with sets
$V'$ (and $V \setminus V'$), a coloring with at most $a$ (and $b - a$) colors can be computed
with an optimum coloring algorithm for several classes of graphs (e.g. interval
or unimodular graphs). Since the colors $a_{opt}+1, \ldots, b_{opt}$ cover less than $\sqrt{|V|}$
vertices in the optimum solution and since
$\bar{\alpha}_{a_{opt},\,b_{opt}-a_{opt}}(G) \le \bar{n}_{a_{opt},\,b_{opt}-a_{opt}}$,
the value $\bar{\alpha}_{a_{opt},\,b_{opt}-a_{opt}}(G)$ is bounded by $\sqrt{|V|}$.
Let $C_H(a_{opt}, b_{opt} - a_{opt})$ be the costs of a coloring computed by a call of the
algorithm $H(a_{opt}, b_{opt} - a_{opt})$ and a corresponding coloring algorithm. Then, we
can bound the costs $C_H(a_{opt}, b_{opt} - a_{opt})$ using the lower bounds (1) and (2) as
follows:

$C_H(a_{opt}, b_{opt} - a_{opt}) \le \alpha_{a_{opt},\,b_{opt}-a_{opt}}(G) \cdot k_{a_{opt}} + \bar{\alpha}_{a_{opt},\,b_{opt}-a_{opt}}(G) \cdot k_{b_{opt}}$

$\le |V| \cdot k_{a_{opt}} + \sqrt{|V|} \cdot k_{b_{opt}}$

$\le \sqrt{|V|}\,\lceil\sqrt{|V|}\rceil \cdot k_{a_{opt}} + \sqrt{|V|} \cdot k_{b_{opt}}$

$\le 2\sqrt{|V|}\,OPT(I).$


For the approximation algorithm for the OCCP problem the values $a$ and
$b - a$ can be bounded by $\chi(G)$. If $a_{opt} \ge \chi(G)$, the optimum costs $OPT(I)$ are
at least $\lceil\sqrt{|V|}\rceil\, k_{\chi(G)}$. In this case, we get an approximate solution with
$a = \chi(G)$ and $b = a$ using

$C_H(\chi(G), 0) \le \alpha_{\chi(G),0}(G)\, k_{\chi(G)} \le \sqrt{|V|}\,\sqrt{|V|}\, k_{\chi(G)} \le \sqrt{|V|}\,OPT(I).$

If $b_{opt} > \chi(G) + a_{opt}$ and $a_{opt} < \chi(G)$, then the optimum costs $OPT(I) \ge
k_{\chi(G)+a_{opt}}$. In this case, we get an approximate solution with $a = a_{opt}$ and
$b - a = \chi(G)$ using

$C_H(a_{opt}, \chi(G)) \le |V|\, k_{a_{opt}} + \sqrt{|V|}\, k_{\chi(G)+a_{opt}} \le 2\sqrt{|V|}\,OPT(I).$

These arguments imply that at most $O(\chi(G)^2)$ calls of the maximum $(a, b - a)$-colorable
subgraph algorithm are sufficient for our approximation algorithm. In the next
part of this section we improve this bound. We show that at most $O(\log \chi(G))$
calls of the maximum $(a, b - a)$-colorable subgraph algorithm $H$ are needed.

For each $a \in \{1, \ldots, \chi(G) - 1\}$, let $x$ be the smallest integer with $x \in
\{1, \ldots, \chi(G)\}$ (if possible) such that $\bar{\alpha}_{a,x}(G) < \lceil\sqrt{|V|}\rceil$. We notice that
$\bar{\alpha}_{a,\chi(G)}(G)$ can be greater than $\lceil\sqrt{|V|}\rceil$ and that

$\bar{\alpha}_{a,1}(G) \ge \bar{\alpha}_{a,2}(G) \ge \ldots \ge \bar{\alpha}_{a,\chi(G)}(G).$

For $a \in \{1, \ldots, \chi(G) - 1\}$ we define

$first(a) = x$ if $\bar{\alpha}_{a,\chi(G)}(G) < \lceil\sqrt{|V|}\rceil$, and $first(a) = \infty$ otherwise.

Since $G$ can be colored with $\chi(G)$ colors, we define $first(\chi(G)) = 0$.


Lemma 6. If $a < a'$ and $first(a), first(a') \ne \infty$ then $first(a') + a' \le first(a) + a$.


The smallest $\hat{a}$ with $first(\hat{a}) < \infty$ can be found using binary search with calls
$H(a, \chi(G))$. Therefore, $\hat{a}$ can be found with $O(\log \chi(G))$ calls of the maximum
a-colorable subgraph algorithm. We notice that $first(a') < \infty$ for each $a' \in
\{\hat{a}, \ldots, \chi(G)\}$. This implies that the mapping $\varphi : a \to first(a) + a$ is
non-increasing.
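The binary search over a relies only on the monotonicity noted above: once first(a) is finite, it stays finite for all larger a. A minimal sketch, assuming a hypothetical oracle `is_finite(a)` that stands for one call H(a, χ(G)) followed by the comparison with the threshold:

```python
def smallest_feasible(lo, hi, is_finite):
    """Return the smallest a in [lo, hi] with is_finite(a) true, together with the
    number of oracle calls. Assumes is_finite is monotone (false ... false true ... true)
    and that is_finite(hi) holds, as it does for hi = chi(G) since first(chi(G)) = 0."""
    calls = 0
    while lo < hi:
        mid = (lo + hi) // 2
        calls += 1
        if is_finite(mid):
            hi = mid          # the answer is mid or smaller
        else:
            lo = mid + 1      # the answer is strictly larger than mid
    return lo, calls

# Toy oracle for illustration only: first(a) is finite exactly for a >= 5, chi(G) = 16.
a_hat, calls = smallest_feasible(1, 16, lambda a: a >= 5)
```

The number of oracle calls is logarithmic in the length of the search range, which matches the O(log χ(G)) bound stated in the text.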


Fig. 2. The mappings $\varphi : a \to first(a) + a$ and $\psi : a \to large(a)$ (plotted over $a$, with the regions $B_1$ and $B_2$ and the points $a_1$ and $a_2$ marked)


For each $a \in \{1, \ldots, \chi(G)\}$, let $x$ be the smallest number in $\{a + 1, \ldots, |V|\}$
(if existent) with $k_x \ge \sqrt{|V|}\,k_a$. Notice that $k_{|V|}$ can be smaller than $\sqrt{|V|}\,k_a$.

We define

$large(a) = x$ if $k_{|V|} \ge \sqrt{|V|}\,k_a$, and $large(a) = \infty$ otherwise.

For $\chi(G) = |V|$, we define $large(|V|) = \infty$.
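The definition of large(a) translates directly into a scan over the cost vector. The sketch below is an illustration under assumptions: the function name and the sample costs are hypothetical, the cost list is assumed to be non-decreasing, and the comparison is taken as "greater than or equal".

```python
import math

def large(a, k):
    """Smallest index x > a (1-based) with k_x >= sqrt(|V|) * k_a, or math.inf if no
    such index exists. k is the non-decreasing cost list with k[0] = k_1."""
    n = len(k)
    threshold = math.sqrt(n) * k[a - 1]
    for x in range(a + 1, n + 1):
        if k[x - 1] >= threshold:
            return x
    return math.inf

# Hypothetical cost vector for |V| = 9, so sqrt(|V|) = 3.
k = [1, 1, 2, 3, 4, 6, 8, 9, 10]
values = [large(a, k) for a in range(1, 9)]
```

Because the costs are sorted, the threshold grows with a, so the computed values never decrease as a increases.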


Lemma 7. If $a < a'$ then $large(a) \le large(a')$.

This lemma implies that the mapping $\psi : a \to large(a)$ is non-decreasing.
Next, we define two regions $B_1 = \{a \in \{\hat{a}, \ldots, \chi(G)\} \mid large(a) \le first(a) + a\}$
and $B_2 = \{a \in \{\hat{a}, \ldots, \chi(G)\} \mid large(a) > first(a) + a\}$. We define $a_1 = \max(B_1)$
and $a_2 = \min(B_2)$ (if the corresponding sets are non-empty). In Figure 2, we
have illustrated the mappings $\psi : a \to large(a)$ and $\varphi : a \to first(a) + a$.
Since the mappings $\varphi$ and $\psi$ are non-increasing and non-decreasing, respectively, for each
pair $a \in B_1$, $a' \in B_2$ we have $a < a'$.

Consider an optimum solution with parameters $(a_{opt}, b_{opt})$ where $a_{opt} <
\chi(G)$. We prove that it is sufficient to compute at most two solutions using a
maximum $(a_1, first(a_1))$-colorable and a maximum $(a_2, first(a_2))$-colorable subgraph.


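Lemma 8. Let $(V', V \setminus V')$ be an optimum solution with parameters $(a_{opt}, b_{opt})$
such that $a_{opt} < \chi(G)$. If $B_1 \ne \emptyset$ and $B_2 \ne \emptyset$ then

$\min\{C_H(a_1, first(a_1)),\ C_H(a_2, first(a_2))\} \le 2\sqrt{|V|}\,OPT(I).$


If $B_1 = \emptyset$, then $a_2 = \hat{a}$ and

$C_H(\hat{a}, first(\hat{a})) \le 2\sqrt{|V|}\,OPT(I).$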
Proof. We analyse the case with $B_1 \ne \emptyset$ and $B_2 \ne \emptyset$; the other case with $B_1 = \emptyset$
follows then directly. Clearly, we have $a_2 = a_1 + 1$. We have to consider three
cases.

Case 1: $large(a_{opt}) \le first(a_{opt}) + a_{opt}$ and $a_{opt} \ge \hat{a}$. In this case, the
optimum solution lies in the first region $B_1$. Notice that $first(a_{opt}) + a_{opt} \le b_{opt}$.
Since the mapping $\varphi : a \to first(a) + a$ is non-increasing and $a_1 \ge a_{opt}$, we
have $first(a_1) + a_1 \le first(a_{opt}) + a_{opt}$. Moreover, it holds that $large(a_1) \le
first(a_1) + a_1$ and that $k_{large(a_1)} \ge \sqrt{|V|}\,k_{a_1}$.

The costs $C_H(a_1, first(a_1)) \le |V|\,k_{a_1} + \sqrt{|V|}\,k_{first(a_1)+a_1}$. Using $\sqrt{|V|}\,k_{a_1} \le
k_{large(a_1)}$ and $large(a_1) \le first(a_1) + a_1 \le first(a_{opt}) + a_{opt} \le b_{opt}$, we obtain

$C_H(a_1, first(a_1)) \le 2\sqrt{|V|}\,k_{b_{opt}} \le 2\sqrt{|V|}\,OPT(I).$

Case 2: $large(a_{opt}) > first(a_{opt}) + a_{opt}$ and $a_{opt} \ge \hat{a}$. In this case, the
optimum solution lies in region $B_2$ and, similarly as above, we get $C_H(a_2, first(a_2)) \le
2\sqrt{|V|}\,OPT(I)$.

Case 3: $a_{opt} < \hat{a}$. In this case, we get a contradiction. □

The next lemma shows how the values $a_1$ and $a_2$ can be computed using
binary search and calls $H(a, large(a) - a)$ and $H(a, large(a) - a - 1)$.

Lemma 9. Let $a \in \{\hat{a}, \ldots, \chi(G)\}$. Using calls $H(a, large(a) - a)$ and
$H(a, large(a) - a - 1)$ we can decide whether $a \in B_1$ or $a \in B_2$.

Now,  we  are  ready  for  our  approximation  algorithm. 

Algorithm  B 

(1) compute $large(a)$ for each $a \in \{1, \ldots, \chi(G)\}$ (using preprocessing in
$O(\chi(G) \cdot |V|)$ time),

(2) compute a first solution with the call $H(\chi(G), 0)$ (this is an arbitrary
coloring for the case $a_{opt} \ge \chi(G)$),

(3) find the smallest $\hat{a}$ with $first(\hat{a}) < \infty$ using binary search with calls
$H(a, \chi(G))$ (these are maximum a-colorable subgraphs),

(4) compute $a_1$ (if existent) and $a_2$ using binary search with calls
$H(a, large(a) - a)$ and $H(a, large(a) - a - 1)$ (these are maximum
$(a, x)$-colorable subgraphs),

(5) compute $first(a_1)$ (if $a_1$ exists) and $first(a_2)$ using binary search with calls
$H(a_i, x)$ (these are maximum $(a_i, x)$-colorable subgraphs),

(6) choose the cheapest solution among the solutions with costs $C_H(a_1,
first(a_1))$ (if $a_1$ exists), $C_H(a_2, first(a_2))$ and $C_H(\chi(G), 0)$.

Using  the  calculations  above,  we  obtain  the  following  result. 

Theorem 10. The approximation algorithm B above computes a coloring of the
OCCP problem restricted to interval graphs (and unimodular graphs) with
approximation ratio $O(|V|^{0.5})$.

The time complexity of this algorithm for interval graphs is given by
$O(\log \chi(G))$ calls of a minimum cost flow algorithm. For unimodular graphs,
we need at most $O(\log \chi(G))$ calls of a linear programming algorithm.
4 Other perfect graphs

In [8], we have proved the following further results for the OCCP problem
restricted to chordal, comparability, permutation and split graphs.

Theorem 11. For each $\varepsilon$ with $0 < \varepsilon < 1$, there exists no polynomial approximation
algorithm with ratio $O(|V|^{1-\varepsilon})$ for the OCCP problem restricted to permutation
graphs (and to comparability graphs), unless P = NP.

Theorem 12. For each $\varepsilon$ with $0 < \varepsilon < 1$, there exists no polynomial approximation
algorithm with ratio $O(|V|^{1-\varepsilon})$ for the OCCP problem restricted to split graphs (and
to chordal graphs), unless P = NP.


Acknowledgement. I thank Thomas Erlebach (TU München) for his helpful
comments and many fruitful discussions.
The Minimum Color Sum of Bipartite Graphs*

Amotz Bar-Noy** Guy Kortsarz***


Abstract. The problem of minimum color sum of a graph is to color the
vertices of the graph such that the sum (average) of all assigned colors
is minimum. Recently, in [BBH+96], it was shown that in general graphs
this problem cannot be approximated within $n^{1-\epsilon}$, for any $\epsilon > 0$, unless
NP = ZPP. In the same paper, a 9/8-approximation algorithm was presented
for bipartite graphs. The hardness question for this problem on
bipartite graphs was left open. In this paper we show that the minimum
color sum problem for bipartite graphs admits no polynomial approximation
scheme, unless P = NP. The proof is by L-reducing the problem of
finding the maximum independent set in a graph whose maximum degree
is four to this problem. This result indicates clearly that the minimum
color sum problem is much harder than the traditional coloring problem,
which is trivially solvable in bipartite graphs. As for the approximation
ratio, we make a further step towards finding the precise threshold. We
present a polynomial 10/9-approximation algorithm. Our algorithm uses
a flow procedure in addition to the maximum independent set procedure
used in previous results.


1  Introduction 

One of the most fundamental problems in scheduling theory is scheduling
dependent tasks efficiently (under some optimization goals) on a single
machine. At any given time, the machine is capable of performing (serving)
any number of tasks as long as these tasks are independent. When the
serving time of each task is the same, this problem is identical to the
well-known coloring problem of graphs. The vertices of the graph represent the
tasks and an edge in the graph between vertices v and u represents the
dependency between the two corresponding tasks. That is, the machine

*The  full  version  of  this  extended  abstract  can  be  found  in  URL 
http: / /www. eng. tau.ac.il/  amotz/ publications.html. 

**Department of Electrical Engineering, Tel-Aviv University, Tel-Aviv 69978, Israel.
E-mail: amotz@eng.tau.ac.il.

***Department of Computer Science, The Open University of Israel, Ramat Aviv,
Israel. E-mail: guyk@tavor.openu.ac.il.
cannot perform the tasks corresponding to vertices u and v concurrently.
Another important application arises in the context of distributed resource
allocation. Here, the vertices represent processors, each of which has one
job to execute. An edge between two vertices indicates that the jobs
belonging to the corresponding processors cannot be executed concurrently
since they require the usage of the same common resource. This problem
is known in the literature as the dining (drinking) philosophers problem
([LYN81, CM84]).

More formally, the coloring problem can be defined as follows. Let $G =
(V, E)$ be an undirected simple graph with n vertices, where V denotes
the set of n vertices and E denotes the set of edges. A coloring of the
vertices of G is a mapping into the set of positive integers, $f : V \to \mathbb{Z}^+$,
such that adjacent vertices are assigned different colors. We refer to $f(v)$
as the color of v.

The traditional optimization goal is to minimize the number of different
assigned colors. We call this problem the minimum coloring (MC)
problem. In the setting of task systems, this is equivalent to finding a
schedule in which the machine finishes performing all the tasks as early
as possible. In the setting of resource allocation, this is equivalent to finding
a schedule in which the last processor finishes executing its job the
earliest. This is an optimization goal that favors the system. However,
from the point of view of the tasks (or processors) themselves, we might
wish to find the best coloring such that the average waiting time to be
served (or to execute the job) is minimized.

Clearly, minimizing the average waiting time is equivalent to minimizing
the sum of all assigned colors. The minimum color sum (MCS) problem
is defined as follows. Let $G = (V, E)$ be an undirected simple graph with
n vertices. We are looking for a coloring in which the sum of the assigned
colors of all the vertices of G is minimized. That is, the value of $\sum_{v \in V} f(v)$
is minimized.
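A small experiment illustrates why minimizing the color sum differs from minimizing the number of colors. The sketch below is an illustration only: the function name and the example tree are hypothetical, and the exhaustive search is restricted to a given number of colors and is exponential in the number of vertices.

```python
from itertools import product

def min_color_sum(vertices, edges, max_colors):
    """Smallest sum of colors over all proper colorings using colors 1..max_colors,
    or None if no proper coloring with that many colors exists."""
    best = None
    for colors in product(range(1, max_colors + 1), repeat=len(vertices)):
        f = dict(zip(vertices, colors))
        if all(f[u] != f[w] for u, w in edges):
            total = sum(f.values())
            if best is None or total < best:
                best = total
    return best

# Hypothetical bipartite example: two adjacent centers 0 and 1 with three leaves each.
vertices = list(range(8))
edges = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5), (1, 6), (1, 7)]
sum_with_two = min_color_sum(vertices, edges, 2)
sum_with_three = min_color_sum(vertices, edges, 3)
```

On this tree every 2-coloring has sum 12, while giving color 1 to all six leaves and colors 2 and 3 to the two centers yields sum 11, so a third color helps even though the graph is bipartite.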

The minimum color sum problem was introduced by Kubicka in [K89].
In [KS89] it was shown that computing the MCS of a given graph is NP-hard.
A polynomial time algorithm was given for the case where G is a
tree. In [KKK89] it was shown that approximating the MCS problem within
an additive constant factor is NP-hard. In a recent paper, [BBH+96], it
was proven that the MCS problem cannot be approximated within $n^{1-\epsilon}$,
for any $\epsilon > 0$, unless NP = ZPP. On the other hand, this paper showed
that an algorithm based on finding iteratively a maximum independent
set is a 4-approximation to the MCS problem. This bound yields a
$4\rho$-approximation polynomial algorithm for the MCS problem for classes of
graphs for which the maximum independent set problem can be polynomially
approximated within a factor of $\rho$.

A special and important sub-class of graphs is the class of bipartite
graphs. In a bipartite graph the set of vertices V is partitioned into two
disjoint sets $V_l$ and $V_r$ such that both sets are independent. That is, all
the edges of E are between vertices of $V_l$ and $V_r$. Coloring $V_l$ by 1 and
$V_r$ by 2 yields a 2-coloring of any bipartite graph. Obviously this is the
best possible solution for the MC problem. However, for the MCS problem
the answer is not straightforward. Denote by MBCS the MCS problem on
bipartite graphs.

Coloring the larger of the sets $V_l$ and $V_r$ by 1 and the other set by
2 yields a solution to the MBCS problem the value of which is at most
3n/2. Obviously the value of the optimal solution is at least n, and therefore
this solution is at least a 3/2-approximation to the optimal solution.
The paper [BBH+96] presents a better approximation of 9/8 using as a
sub-procedure the algorithm for finding a maximum independent set. In
bipartite graphs, finding a maximum independent set can be done in polynomial
time. Therefore, their approximation algorithm is also polynomial.
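The 3n/2 bound can be verified in one line. Writing m for the size of the larger of the two sides (the symbol m is introduced here only for illustration), the larger side receives color 1 and the remaining n - m vertices receive color 2:

```latex
\[
  m \cdot 1 + (n - m) \cdot 2 \;=\; 2n - m \;\le\; 2n - \frac{n}{2} \;=\; \frac{3n}{2},
  \qquad \text{since } m \ge \frac{n}{2}.
\]
```

Dividing by the trivial lower bound n on the optimum gives the ratio 3/2 mentioned above.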

New results: The contributions of this paper are the following two results:

- We prove the first hardness result for MBCS. We show that the MBCS
problem admits no polynomial approximation scheme, unless P =
NP. The proof is by L-reducing the problem of finding the maximum
independent set in a graph whose maximum degree is four to the
MBCS problem, which implies that MBCS is MAXSNP-hard [PY88]. This
result indicates clearly that the MCS problem is much harder than the
traditional coloring problem.

- We improve the approximation ratio for the MBCS problem by presenting
a 10/9-approximation algorithm. Our algorithm introduces a new
technique. It employs a flow procedure in addition to the maximum
independent set procedure used in [BBH+96].

Max-type vs. sum-type problems: Our impossibility result raises the general
question of the connection between “max-type” and “sum-type”
problems. The MC problem is a max-type problem whereas the MCS problem
is a sum-type problem. The input and the feasible solutions for both
problems are the same; the difference lies in the optimization goal. In the
full version of this paper ([BK97]) we examine the “max-type” and the
“sum-type” of the Traveling Salesperson problem (TSP). The discussion
there raises the interesting question of classifying problems according to
the relationship between their “max-type” version and their “sum-type”
version. The coloring problem and the traveling salesperson problem each
belong to a different class.

2  Preliminaries 

Given a graph $G(V, E)$ we use the following notation. Let MIS($G$) denote
the largest independent set in $G$. For any set $S \subseteq V$, let $N(S)$ be the set
of neighbors of $S$, and let MIS($S$) denote the maximum independent set in
the graph induced by $S$. We also use the term $S$ to denote the size of $S$.
Given any coloring $f$ of a graph, we denote by SC($f$) the sum of colors
in $f$, i.e., SC($f$) $= \sum_{v \in V} f(v)$. When all the vertices in a set $S \subseteq V$ are
colored by the same color $c$, we say that $S$ is colored by $c$.

We say that problem $P$ admits a polynomial approximation scheme
if for any $\epsilon > 0$ there exists a polynomial time approximation algorithm
for $P$ whose approximation ratio is bounded by $(1 + \epsilon)$.

L-reduction The L-reduction ([PY88]) is a tool that helps in proving hardness
results. Unlike the usual NP-hardness reductions, it “preserves” approximation
ratios. In order to define the L-reduction we need the following
notation. Let $P$ be an optimization (either minimization or maximization)
problem. Denote by $I(P)$ the set of instances of problem $P$, by
$sol(P)$ the set of feasible solutions of problem $P$, and by $c_P(s)$ the cost
of any feasible solution $s$ for $P$. Suppose now that $P$ and $Q$ are
two optimization problems. In order to construct an L-reduction we need
to define two (polynomially computable) functions $\mathcal{R} : I(P) \to I(Q)$ and
$\mathcal{S} : sol(Q) \to sol(P)$. For any instance $x \in I(P)$ let $c_{opt}(x)$ be the value
of the optimal solution for $x$ and let $c_{opt}(\mathcal{R}(x))$ be the value of the
optimal solution for $\mathcal{R}(x)$. The two functions $\mathcal{R}$ and $\mathcal{S}$ are an L-reduction
from problem $P$ to problem $Q$ if there exist two constants $\alpha$ and $\beta$ such
that the following two properties hold:

1. $c_{opt}(\mathcal{R}(x)) \le \alpha \cdot c_{opt}(x)$.

2. For any feasible solution $s \in sol(Q)$ of $\mathcal{R}(x)$, $\mathcal{S}(s)$ is a feasible solution
for $x$ and $|c_{opt}(x) - c_P(\mathcal{S}(s))| \le \beta \cdot |c_{opt}(\mathcal{R}(x)) - c_Q(s)|$.

Theorem 1 [PY88]. Suppose that problem $P$ admits no polynomial approximation
scheme and that problem $P$ can be L-reduced to problem $Q$.
Then problem $Q$ admits no polynomial approximation scheme.
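
The reason an L-reduction transfers hardness can be sketched in one chain of inequalities. This is our own illustration, not part of [PY88]: we write $\mathcal{R}$ and $\mathcal{S}$ for the two functions of the reduction, and we assume that $s$ is a solution of $\mathcal{R}(x)$ whose relative error is at most $\epsilon$. Under that assumption the solution $\mathcal{S}(s)$ has relative error at most $\alpha \beta \epsilon$ for $x$, so an approximation scheme for $Q$ would yield one for $P$.

```latex
% Assumption (ours): |c_opt(R(x)) - c_Q(s)| <= eps * c_opt(R(x)).
\begin{align*}
|c_{opt}(x) - c_P(\mathcal{S}(s))|
  &\le \beta \cdot |c_{opt}(\mathcal{R}(x)) - c_Q(s)|
     && \text{(property 2)} \\
  &\le \beta \epsilon \cdot c_{opt}(\mathcal{R}(x))
     && \text{(assumption on } s\text{)} \\
  &\le \alpha \beta \epsilon \cdot c_{opt}(x)
     && \text{(property 1)}
\end{align*}
```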

The MIS and 4-MIS problems The Maximum Independent Set (MIS) problem
is the following. Given an undirected graph $G(V, E)$ with $n$ vertices,
the goal is to find a maximum independent set, i.e., a maximum sized set
$S \subseteq V$ such that no two vertices of $S$ share an edge. The 4-MIS problem
is the MIS problem restricted to graphs with maximum degree 4.

Theorem 2 [ALM+92]. There exists some $\epsilon > 0$ such that the 4-MIS
problem admits no $(1 + \epsilon)$-approximation algorithm, unless P = NP (and hence
4-MIS admits no polynomial approximation scheme).

Known algorithms for the MBCS problem We recall the approximation algorithm
presented in [BBH+96]. For a given bipartite graph $G$, denote by
$I_1$ the maximum independent set in $G$, by $I_2$ the maximum independent
set in $G \setminus I_1$, by $I_3$ the maximum independent set in $G \setminus (I_1 \cup I_2)$, and so
on. The algorithm of [BBH+96] is best explained by the definition of a sequence
of (roughly) $\log n$ possible algorithms. Let A(2) be the algorithm
that colors the vertices of $G$ with two colors, the larger side of $G$ by 1
and the smaller side by 2. Let A(3) be the following algorithm: color the
vertices of $I_1$ by 1, and then color the vertices of $G \setminus I_1$ by 2 and 3 (i.e.,
color the larger side in the remaining graph by 2 and the smaller side by
3). In general, for $i \ge 3$ and for $1 \le j \le i - 2$, algorithm A($i$) colors the
set $I_j$ with color $j$, and then colors the larger side of the remaining graph
by $i - 1$ and the smaller side by $i$. Altogether, algorithm A($i$) uses $i$
colors. Note that we have defined at most $\lfloor \log n \rfloor$ algorithms, because the
maximum independent set in any bipartite graph with $n$ vertices contains
at least $n/2$ vertices. Let A' be the last possible algorithm in this family of
algorithms. Since $G$ is a bipartite graph, it follows that $I_1 \ge n/2$. Therefore,
algorithm A(2) is a 3/2-approximation algorithm. Consider now the
following algorithm, denoted by B, that runs algorithms A(2) and A(3)
and picks the best solution.
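
The algorithms A(2), A(3) and B can be sketched as follows. This is only an illustration under stated assumptions: the function names are ours, the maximum independent set is found by exhaustive search (so the sketch is usable on tiny graphs only, whereas a polynomial method is needed in general), and the "sides" of the remaining graph are taken to be the given bipartition restricted to the uncolored vertices.

```python
from itertools import combinations

def sum_of_colors(coloring):
    """SC(f): the sum of the colors assigned to all vertices."""
    return sum(coloring.values())

def is_valid(adj, coloring):
    """A coloring is feasible if no edge joins two vertices of the same color."""
    return all(coloring[u] != coloring[v] for u in adj for v in adj[u])

def max_independent_set(adj):
    """Brute-force MIS; exponential, for tiny illustrative graphs only."""
    vertices = sorted(adj)
    for size in range(len(vertices), -1, -1):
        for cand in combinations(vertices, size):
            chosen = set(cand)
            if all(adj[u].isdisjoint(chosen) for u in chosen):
                return chosen
    return set()

def color_two_sides(left, right, first_color, coloring):
    """Color the larger side by first_color and the smaller by first_color + 1."""
    big, small = (left, right) if len(left) >= len(right) else (right, left)
    for v in big:
        coloring[v] = first_color
    for v in small:
        coloring[v] = first_color + 1

def algorithm_a2(left, right):
    """A(2): two colors, the larger side gets color 1."""
    coloring = {}
    color_two_sides(left, right, 1, coloring)
    return coloring

def algorithm_a3(adj, left, right):
    """A(3): color I_1 by 1, then the two remaining sides by 2 and 3."""
    i1 = max_independent_set(adj)
    coloring = {v: 1 for v in i1}
    color_two_sides(left - i1, right - i1, 2, coloring)
    return coloring

def algorithm_b(adj, left, right):
    """B: run A(2) and A(3) and keep the coloring with the smaller sum."""
    candidates = [algorithm_a2(left, right), algorithm_a3(adj, left, right)]
    return min(candidates, key=sum_of_colors)
```

On a graph with three vertices per side and a single edge, A(2) pays for coloring a whole side by 2, while A(3) colors five vertices by 1 and only one by 2.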

Theorem 3 [BBH+96]. Algorithm B is a 9/8-approximation algorithm
for the MBCS problem.

An algorithmic tool We now describe the new tool used in our approximation
algorithm. Define the 2-Neighborhood problem as follows. Given
a bipartite graph $G(V_l, V_r, E)$ we look for a set $S \subseteq V_l$ such that
$d_S = 2S - N(S)$ is maximum. We note that the order in which $V_l$ and $V_r$ are
specified in the problem presentation is important, that is, the solution $S$
is a subset of $V_l$. Polynomial time solutions for problems of this nature
are known (see, e.g., [GGT89]).
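
To make the objective of the 2-Neighborhood problem concrete, here is a minimal sketch that evaluates $2S - N(S)$ for every subset of the left side. It is an exhaustive search written for illustration only, not the polynomial flow-based method referred to above; the function names and the adjacency-dictionary input format are our assumptions.

```python
from itertools import combinations

def neighbors(adj, subset):
    """N(S): all vertices adjacent to at least one vertex of S."""
    result = set()
    for v in subset:
        result |= adj[v]
    return result

def two_neighborhood(adj, left):
    """Return (S, d_S) maximizing d_S = 2|S| - |N(S)| over subsets S of `left`.

    Exhaustive search, exponential in |left|; illustration only.
    """
    best_set, best_value = set(), 0  # the empty set always achieves d_S = 0
    ordered = sorted(left)
    for size in range(1, len(ordered) + 1):
        for cand in combinations(ordered, size):
            value = 2 * len(cand) - len(neighbors(adj, cand))
            if value > best_value:
                best_set, best_value = set(cand), value
    return best_set, best_value
```

For example, if two left vertices share a single neighbor while a third left vertex has three private neighbors, the best choice takes the first two and leaves the third out.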

3 A hardness result for the MBCS problem

In this section, we prove that (unless P = NP) the MBCS problem has no
polynomial approximation scheme. We do that by exhibiting an L-reduction
from the 4-MIS problem to the MBCS problem (hence showing that the
MBCS problem is MAXSNP-hard). By Theorems 1 and 2 the hardness result
is implied.

3.1 The construction - the function $\mathcal{R}$

Let $G(V, E)$ be an instance of the 4-MIS problem. The function $\mathcal{R}$ should
map $G$ into a graph $\tilde{G}$ which is an instance of the MBCS problem. First,
$\tilde{G}$ contains a vertex corresponding to each vertex in $V$. In $\tilde{G}$, $V$ is an
independent set. We assume an order on the vertices of $G$. Whenever we
consider an edge $(x, y) \in E$ we assume that $x < y$. The construction
involves adding a gadget for each edge $e = (x, y) \in E$. Each gadget is
composed of twelve independent sets of vertices containing no internal
edges (edges only cross from one set to another). The sets of
vertices corresponding to different edges are disjoint.

Before describing the sets of vertices and the edges of any gadget we
need some definitions. We say that two (independent) sets $A$ and $B$ are
cliqued if every vertex in $A$ is connected to every vertex in $B$, that is, the
sets $A$ and $B$ induce a complete bipartite graph. We say that the two sets
are matched if $|A| = |B|$ and every vertex $x$ in $A$ has a single neighbor
$m(x)$ in $B$, that is, the sets $A$ and $B$ induce a perfect matching. The
sets and edges in the gadget corresponding to the edge $e = (x, y)$ are as
follows.

Main and matched sets:

1. A set $XYX$ of 3 vertices and a matched set $m(XYX)$ of 3 vertices.

2. A set $XYY$ of 3 vertices and a matched set $m(XYY)$ of 3 vertices.

3. A set $XY$ of 6 vertices and a matched set $m(XY)$ of 6 vertices.

Imposing sets:

1. A set $I_1(XYX)$ of 18 vertices and a cliqued set $I_2(XYX)$ of 9 vertices.

2. A set $I_1(m(XYX))$ of 6 vertices and a cliqued set $I_2(m(XYX))$ of 3
vertices.

3. Two sets $I_1(XY)$ of 24 vertices and $I_1(m(XY))$ of 12 vertices.

Additional edges between the sets:

1. The vertex $x$ ($y$) is connected to all 3 vertices of $XYX$ ($XYY$).

2. The sets $XYX$ and $XYY$ are each cliqued with $XY$.

3. The sets $XYX$ ($m(XYX)$) and $I_2(XYX)$ ($I_2(m(XYX))$) are cliqued.

4. The sets $XY$ ($m(XY)$) and $I_1(XY)$ ($I_1(m(XY))$) are cliqued.

This completes the description of the gadget corresponding to each
edge $e = (x, y)$ and the description of the function $\mathcal{R}$. The above sets
depend on $e$, that is, there is such a gadget for every edge $e \in E$. We
avoid adding $e$ as a subscript in these sets, for simplicity of notation.
In order for the function $\mathcal{R}$ to be valid we demonstrate a 2-coloring of $\tilde{G}$,
proving that the graph $\tilde{G}$ is bipartite.

Lemma 4. The graph $\tilde{G}$ is bipartite.
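
The construction can be checked mechanically on small inputs. The sketch below builds the gadgets as we read them from the lists above (set sizes, matched pairs, cliqued pairs, and the edges from $x$ and $y$) and then tests bipartiteness by breadth-first 2-coloring. The vertex labels `(edge, set_name, index)` and the function names are hypothetical, introduced only for this illustration.

```python
from collections import deque
from itertools import product

def build_reduced_graph(vertices, edges):
    """Build the constructed graph from an input graph (vertices, edges).

    Gadget vertices are labeled (edge, set_name, index); these labels are ours.
    Returns an adjacency dict mapping each vertex to the set of its neighbors.
    """
    adj = {v: set() for v in vertices}

    def connect(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    sizes = {
        "XYX": 3, "m(XYX)": 3, "XYY": 3, "m(XYY)": 3, "XY": 6, "m(XY)": 6,
        "I1(XYX)": 18, "I2(XYX)": 9, "I1(m(XYX))": 6, "I2(m(XYX))": 3,
        "I1(XY)": 24, "I1(m(XY))": 12,
    }
    matched = [("XYX", "m(XYX)"), ("XYY", "m(XYY)"), ("XY", "m(XY)")]
    cliqued = [
        ("I1(XYX)", "I2(XYX)"), ("I1(m(XYX))", "I2(m(XYX))"),
        ("XYX", "XY"), ("XYY", "XY"),
        ("XYX", "I2(XYX)"), ("m(XYX)", "I2(m(XYX))"),
        ("XY", "I1(XY)"), ("m(XY)", "I1(m(XY))"),
    ]
    for e in edges:
        x, y = e
        group = {name: [(e, name, i) for i in range(k)] for name, k in sizes.items()}
        for a, b in matched:              # perfect matching between the two sets
            for u, v in zip(group[a], group[b]):
                connect(u, v)
        for a, b in cliqued:              # complete bipartite graph between the sets
            for u, v in product(group[a], group[b]):
                connect(u, v)
        for u in group["XYX"]:
            connect(x, u)
        for u in group["XYY"]:
            connect(y, u)
    return adj

def two_coloring(adj):
    """Return a side assignment (0 or 1) if the graph is bipartite, else None."""
    side = {}
    for start in adj:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in side:
                    side[v] = 1 - side[u]
                    queue.append(v)
                elif side[v] == side[u]:
                    return None
    return side
```

Even when the input graph is a triangle, which is not bipartite, the constructed graph passes the check, because the original vertices become pairwise non-adjacent and end up on the same side.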

The intuition behind the construction: The goal of the construction is to
enable us to define the right function $\mathcal{S}$. The role of the imposing sets is to
force a situation in which some sets cannot be colored by a specific color.
For example, it will be shown that in an optimal coloring the imposing set
$I_2(XYX)$ is colored by 2. Consequently, the set $XYX$ cannot be colored
by 2. In general, in an optimal solution, all the sets of type $I_1$ are colored
by 1 and all the sets of type $I_2$ are colored by 2. The role of the matched
sets is to ensure that the sum coloring of two matched sets is fixed in
any optimal coloring. For example, if a vertex in $XYX$ is colored by 1,
then its matched vertex is colored by 3, and vice versa (recalling that
these two sets cannot be colored by 2). Thus every pair in $XYX$ and
$m(XYX)$ adds exactly 4 to the sum coloring in an optimal coloring and
the contribution of $XYX$ and $m(XYX)$ is fixed. Now let us explain the
main idea in the construction. Let $x$ and $y$ be two vertices adjacent in $G$
(i.e., $(x, y) \in E$). We will show that we lose in the sum coloring if both $x$
and $y$ are colored by 1. Indeed, say that both $x$ and $y$ are colored 1, and
consider the colors of $XY$, $XYX$, $XYY$. In the best coloring $XYX$ is
colored by 3 and $XYY$ by 2. Therefore, since the set $I_1(XY)$ is colored
by 1, it follows that $XY$ is colored by at least 4. On the other hand, if
one of $x$ and $y$ is not colored by 1, we may gain by assigning $XY$ a color
less than 4. This follows since $XYX$ and $XYY$ will “waste” only one of
the colors 2 and 3. Hence, it is possible to color $XY$ with either 2 or 3.
Therefore, a “good” sum coloring colors as large an independent
set in $G$ as possible by 1. Thus, a “good” approximation for the MBCS problem implies
a “good” approximation for the 4-MIS problem.

3.2 The function $\mathcal{S}$

A coloring $f$ of the vertices in $\tilde{G}$ is proper if the following two properties
hold for every edge.

Imposing properties: The sets $I_1(XYX)$, $I_1(m(XYX))$, $I_1(XY)$, and
$I_1(m(XY))$ are colored by 1. The sets $I_2(XYX)$ and $I_2(m(XYX))$
are colored by 2.

Independence property: All the vertices of $G$ that are colored by 1 in
$f$ form an independent set in $G$.

The process of constructing $\mathcal{S}$ is as follows. We start with any feasible
coloring $f$ of $\tilde{G}$. We then show in five stages that $f$ can be transformed into a
proper coloring $\hat{f}$ such that the sum of colors in $\hat{f}$ is no larger than the sum
of colors in $f$ (SC($\hat{f}$) $\le$ SC($f$)). The mapping $\mathcal{S}$ is now defined by choosing
the set of vertices in $G$ that are colored by 1 by $\hat{f}$, denoted by $I_1(\hat{f})$. Note
that by the independence property, $I_1(\hat{f})$ is also an independent set in $G$.

In the first stage we transform $f$ into $f_1$ such that all the vertices in
any independent set in any gadget are colored by the same color. In the
second stage, we transform $f_1$ into a coloring $f_2$ that is locally minimal,
that is, a coloring such that each set in the gadget is colored by no more
than $k + 1$, where $k$ is the number of sets neighboring this set. In the
third stage, we show how to transform $f_2$ into a coloring $f_3$ such that
the imposing properties hold. In the fourth stage, we transform $f_3$ into a
coloring $f_4$ in which all the sets $XYX$ and $XYY$ in all the gadgets are
colored by no more than 3. Finally, in the fifth stage we transform $f_4$
into the desired coloring $\hat{f}$ by showing how to achieve the independence
property. In all five stages the new coloring has no worse sum coloring
than the previous one. The full proof appears in [BK97].

3.3 The L-reduction properties

We now turn to prove the two L-reduction properties. Let OPT be the
minimum sum coloring in $\tilde{G}$ and let MIC = SC(OPT). The next lemma
proves the first property of the L-reduction.

Lemma 5. There exists a constant $\alpha$ such that MIC $\le \alpha \cdot$ MIS($G$).

For the second property of the L-reduction, we need to show the existence
of a constant $\beta$ such that for any legal coloring $f$ of $\tilde{G}$ the following
holds: MIS($G$) $- \mathcal{S}(f) \le \beta$(SC($f$) $-$ MIC). We prove this inequality with
$\beta = 1$. The proof uses the following two lemmas. Let $I_1$ be the maximum
independent set in $G$.

Lemma 6. MIC $\le 135 \cdot |E| + 2n - I_1$.

Now let $f$ be an arbitrary coloring of $\tilde{G}$ and let $\hat{f}$ be its corresponding
proper coloring. Let $I_1(\hat{f})$ be the set of vertices colored by 1 in $\hat{f}$, and
thus $\mathcal{S}(f) = I_1(\hat{f})$.

Lemma 7. SC($f$) $\ge 135 \cdot |E| + 2n - I_1(\hat{f})$.

The following lemma states the second property of the L-reduction.

Lemma 8. MIS($G$) $- \mathcal{S}(f) \le$ SC($f$) $-$ MIC.
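
The way Lemma 8 follows from Lemmas 6 and 7 can be written out as a short derivation. It is a sketch under our reading of the two lemmas: Lemma 6 as MIC $\le 135 \cdot |E| + 2n - I_1$, Lemma 7 as SC($f$) $\ge 135 \cdot |E| + 2n - I_1(\hat{f})$, with $\hat{f}$ the proper coloring obtained from $f$, and with $I_1$ and $I_1(\hat{f})$ standing for the sizes of the corresponding sets. The term $|E|$ is our reconstruction of the number of gadgets.

```latex
\begin{align*}
\mathrm{SC}(f) - \mathrm{MIC}
  &\ge \bigl(135 \cdot |E| + 2n - I_1(\hat{f})\bigr)
     - \bigl(135 \cdot |E| + 2n - I_1\bigr)
     && \text{(Lemmas 7 and 6)} \\
  &= I_1 - I_1(\hat{f}) \\
  &= \mathrm{MIS}(G) - \mathcal{S}(f).
\end{align*}
```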

We have completed the construction of a valid L-reduction from the 4-MIS problem
to the MBCS problem. The following theorem follows from Theorems
1 and 2.

Theorem 9. There exists an $\epsilon > 0$ such that there is no $(1 + \epsilon)$-ratio
approximation algorithm for the MBCS problem unless P = NP.

4 Improved approximation algorithm for MBCS

In the previous section we have shown that there exists some $\epsilon > 0$ such
that the MBCS problem has no $(1 + \epsilon)$-approximation algorithm. However,
the precise threshold for the approximation is yet to be determined.
We take a further step in this direction. In this section, we present a
new algorithm C that utilizes a new procedure Neig. We prove that this
procedure, combined with algorithms A(2), A(3), and A(4), yields a 10/9-approximation
algorithm for the MBCS problem.

4.1 Procedure Neig and Algorithm C

Procedure Neig utilizes the solution to the 2-Neighborhood problem. It
uses the following subsets and subgraphs of $G$.

1. $I_1$ - the maximum independent set in $G$. $I_1^l = I_1 \cap V_l$ and $I_1^r = I_1 \cap V_r$.

2. $Z$ - the larger side of $G \setminus I_1$ and $W$ - the smaller side of $G \setminus I_1$. Without
loss of generality, assume that $Z \subseteq V_l$ and $W \subseteq V_r$.

3. $G_Z = (Z, I_1^r, E_Z)$ - the (bipartite) subgraph induced by $Z$ and $I_1^r$.
$G_W = (W, I_1^l, E_W)$ - the (bipartite) subgraph induced by $W$ and $I_1^l$.

4. $S_Z$ - the set maximizing $d_{S_Z} = 2S_Z - N(S_Z)$ in $G_Z$.
$S_W$ - the set maximizing $d_{S_W} = 2S_W - N(S_W)$ in $G_W$.

5. $N_1(S_Z) = N(S_Z) \cap I_1^r$ and $N_1(S_W) = N(S_W) \cap I_1^l$.

Procedure Neig:

If $d_{S_Z} \ge d_{S_W}$ then color:

1. $I_1^l \cup S_Z \cup (I_1^r \setminus N_1(S_Z))$ by 1.

2. $W \cup N_1(S_Z)$ by 2.

3. $Z \setminus S_Z$ by 3.

If $d_{S_Z} < d_{S_W}$ then color:

1. $I_1^r \cup S_W \cup (I_1^l \setminus N_1(S_W))$ by 1.

2. $Z \cup N_1(S_W)$ by 2.

3. $W \setminus S_W$ by 3.

For the case $d_{S_Z} \ge d_{S_W}$, procedure Neig can be described as follows.
Start with the initial coloring of A(3), that is, $I_1$ is colored by 1, $Z$ (the
larger of the two remaining sides) is colored by 2 and $W$ by 3. Thus
SC(A(3)) $= I_1^l + I_1^r + 2Z + 3W$. Next, re-color $Z$ by 3 and $W$ by 2, losing
$Z - W$ in the sum coloring. Next, change the color of $S_Z$ from 3 to 1,
gaining $2S_Z$ in the sum coloring. This forces all the neighbors of $S_Z$ in
$I_1$, namely $N_1(S_Z)$, to be colored by a color different from 1; thus color them by
2. Here we lose $N_1(S_Z)$ in the sum coloring. The net profit in the sum
coloring is therefore $2S_Z - N_1(S_Z) + W - Z = d_{S_Z} + W - Z$. Similarly,
it can be shown that for the case $d_{S_W} > d_{S_Z}$, the net profit is $d_{S_W}$. (This
case is better for us since we do not need to switch the colors of $Z$ and
$W$, losing $Z - W$.) Thus, we proved the following proposition.

Proposition 10.

(1). If $d_{S_Z} \ge d_{S_W}$ then SC(Neig) $=$ SC(A(3)) $- d_{S_Z} + (Z - W)$.

(2). If $d_{S_W} > d_{S_Z}$ then SC(Neig) $=$ SC(A(3)) $- d_{S_W}$.
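
Procedure Neig can be sketched as follows, assuming that the sets $I_1^l$, $I_1^r$, $Z$ and $W$ have already been computed and are passed in. The sets $S_Z$ and $S_W$ are found here by exhaustive search rather than by the polynomial method, and all function and parameter names are ours; the sketch is meant for checking Proposition 10 on small examples.

```python
from itertools import combinations

def best_two_neighborhood(adj, side, allowed):
    """Maximize 2|S| - |N(S) & allowed| over subsets S of `side` (brute force)."""
    best, best_val = set(), 0
    for size in range(1, len(side) + 1):
        for cand in combinations(sorted(side), size):
            nbrs = set().union(*(adj[v] for v in cand)) & allowed
            val = 2 * len(cand) - len(nbrs)
            if val > best_val:
                best, best_val = set(cand), val
    return best, best_val

def procedure_neig(adj, i1_l, i1_r, z, w):
    """Return (coloring, d_SZ, d_SW) for the two cases of Procedure Neig.

    i1_l, i1_r: the parts of I_1 on the left and right side;
    z, w: the larger and smaller side of the graph after removing I_1,
    assumed to lie on the left and right side respectively.
    """
    s_z, d_z = best_two_neighborhood(adj, z, i1_r)
    s_w, d_w = best_two_neighborhood(adj, w, i1_l)
    if d_z >= d_w:
        n1 = set().union(*(adj[v] for v in s_z)) & i1_r
        ones, twos, threes = i1_l | s_z | (i1_r - n1), w | n1, z - s_z
    else:
        n1 = set().union(*(adj[v] for v in s_w)) & i1_l
        ones, twos, threes = i1_r | s_w | (i1_l - n1), z | n1, w - s_w
    coloring = {}
    for color, group in ((1, ones), (2, twos), (3, threes)):
        for v in group:
            coloring[v] = color
    return coloring, d_z, d_w
```

On a six-vertex example with $I_1^l = \{a_1\}$, $I_1^r = \{b_1, b_2\}$, $Z = \{z_1, z_2\}$ and $W = \{w_1\}$, the sum of A(3) is $3 + 2 \cdot 2 + 3 \cdot 1 = 10$ and the resulting sum of Neig agrees with the first case of Proposition 10.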

We  conclude  this  subsection  with  the  description  of  algorithm  C.  It 
clearly  follows  that  the  algorithm  has  a  polynomial  running  time. 

Algorithm  C 

- Run algorithms A(2), A(3), A(4), and Procedure Neig.

- Pick the solution whose sum coloring is the minimum among the four
coloring solutions.

4.2  Analysis 

All through the analysis, let $Z = (n - I_1)/2 + \epsilon_d n$ and $W = (n - I_1)/2 - \epsilon_d n$.
The term $\epsilon_d n$ quantifies the extent to which the graph induced by $Z \cup W$
is unbalanced. This is the graph resulting once the maximum independent
set $I_1$ is deleted from $G$.

Outline of the analysis: If $Z - W = 2\epsilon_d n$ is “large” enough, then the
10/9-ratio is already yielded by min{SC(A(2)), SC(A(3))}. Otherwise, $Z - W$
is not too “large”. If $I_2$ is “large” enough, then this time already
min{SC(A(2)), SC(A(4))} yields the 10/9-ratio. Otherwise, $W$ is almost
as “large” as $Z$ and $I_2$ is not too “large”. If $W$ is small enough and
therefore $Z$ is also “small” and $I_1$ is “large” enough, then SC(A(3)) alone
yields the 10/9-ratio. Otherwise $Z - W$ and $I_2$ are not too “large” and $W$
is not too “small”. If the optimal algorithm does not deviate much from
algorithm A(3), then again min{SC(A(2)), SC(A(3))} yields the 10/9-ratio.
Finally, if none of the previous conditions holds, we use the new
procedure Neig and show that min{SC(A(2)), SC(Neig)} yields the 10/9-ratio.
The analysis is partitioned into the above five cases. The complete
analysis appears in [BK97].

Theorem 11. Algorithm C is a polynomial 10/9-approximation algorithm
for the MBCS problem.

Abstract. This paper is concerned with the polynomial time approximability
of node-deletion problems for hereditary properties.

We will focus on such graph properties that are derived from matroids
definable on the edge set of any graph. It will be shown first that all the
node-deletion problems for such properties can be uniformly formulated
by a simple but non-standard form of the integer program. A primal-dual
approximation algorithm based on this and the dual of its linear
relaxation is then presented.

When a property has infinitely many minimal forbidden graphs no constant
factor approximation for the corresponding node-deletion problem
has been known except for the case of the Feedback Vertex Set (FVS)
problem in undirected graphs. It will be shown next that FVS is not the
sole exceptional case and that there exist infinitely many graph (hereditary)
properties with an infinite number of minimal forbidden graphs, for
which the node-deletion problems are efficiently approximable to within
a factor of 2. Such properties are derived from the notion of matroidal
families of graphs and relaxing the definitions for them.


1  Introduction 

This paper is concerned with the polynomial time approximability of node-deletion
problems for hereditary properties. The node-deletion problem for a
graph property $\pi$ (denoted ND($\pi$) throughout the paper) is a typical graph optimization
problem; that is, given a node-weighted graph $G$, find a node set of the
minimum weight sum s.t. deletion of it (along with all the incident edges) from
$G$ leaves a subgraph satisfying the property $\pi$. A graph property $\pi$ is hereditary
if every subgraph of a graph satisfying $\pi$ also satisfies $\pi$. A number of
well-studied graph properties are hereditary such as independent set, planar,
bipartite, degree-constrained, circular-arc, circle graph, chordal, comparability,
permutation, perfect. Consequently, many well known graph problems fall
into this class of problems when desired graph properties are specified appropriately.
Lewis and Yannakakis proved, however, that whenever $\pi$ is nontrivial and
hereditary on induced subgraphs ND($\pi$) is NP-hard [LY80]. When this general
NP-hardness result was established in 1980, almost nothing was known about
the approximability of ND($\pi$)'s except for good approximation algorithms for
the Vertex Cover (VC) problem (i.e., $\pi$ = “independent set”). Moreover, their
generic reductions from VC to other ND($\pi$)'s are approximation preserving, and
as such, no ND($\pi$) can be approximated better than VC can be. One question
posed therein was thus concerned with the other direction of approximability:
Can other node-deletion problems be approximated as well as VC can be?

* This work is partially supported by a grant from the Okawa Foundation for Information
and Telecommunications.

It has been long known that VC can be approximated with ratio 2 (achievable
by a simple maximal matching heuristic [Gav74] for the unweighted case) and
a better approximation has been a subject of extensive research over the years.
Yet the best constant bound has remained the same at 2 while the best known
heuristics can accomplish only slightly better ($2 - \frac{\log\log n}{2\log n}$ [BE85, MS85]). On
the other hand very few other ND($\pi$)'s have been shown to be approximable
within a factor of $c$, for any constant $c$, not to mention a constant of 2. As
observed in [LY93], whenever hereditary $\pi$ has only a finite number of minimal
forbidden graphs ND($\pi$) can be efficiently approximated to within some constant
factor of the optimum. It was in fact conjectured therein that those with
finitely many minimal forbidden graphs are the only hereditary properties which
yield constant factor approximable node-deletion problems (see also [Yan94]). It
was found later, however, that this conjecture does not hold as is when the (unweighted)
Feedback Vertex Set (FVS) problem (i.e., $\pi$ = “acyclic”) in undirected
graphs was shown to be approximable to within a factor of 4 [BGNR94] (Note:
every simple cycle of each length is a minimal forbidden graph for this $\pi$). Until
now this problem has been the only known exception to the Lund-Yannakakis
conjecture.

1.1  Our  results 

In this paper we will show that there exist infinitely many ND($\pi$)'s for $\pi$ with an
infinite number of minimal forbidden graphs, each of which is approximable to a
factor of 2. For that purpose we shall concentrate on such hereditary properties
that can be derived from (independent sets of) matroids definable on the edge
set of any graph (details given later). The class of ND($\pi$)'s for such properties
includes VC, FVS, and many others. It will be shown first that all ND($\pi$)'s in
this class can be uniformly formulated by a simple but non-standard form of
the integer program using matroid rank functions. A primal-dual approximation
algorithm for such ND($\pi$)'s is then designed based on this formulation and the
dual of its linear programming relaxation, which is simpler than those algorithms
for FVS given in [BBF95, BG94, CGHW96]. In particular our algorithm does
not look into nor modify explicitly, unlike the previous algorithms for FVS, any
special structure in graphs under consideration. An analysis of this algorithm
reveals that its performance ratio can be reduced to the combinatorial bound
arising from the underlying structures of the problems.

It will be shown next, as an application of the current primal-dual approach,
that FVS is not the sole exceptional case; i.e., there exist other (hereditary)
properties $\pi$ with an infinite number of minimal forbidden graphs, s.t. ND($\pi$)'s
are efficiently approximable to within a factor of 2, the best constant factor
known for either VC or FVS. In fact, we will show, there are infinitely many
of them (at least countably many). Such properties are derived from the notion
of matroidal families of graphs and relaxing the definitions for them (details
later). The infinite sequence of these properties will be constructed having those
for both VC and FVS at its basis and thus providing a proper generalization
of them. It is also worth pointing out that our formulation for these ND($\pi$)'s
introduces an integrality gap of at most 2, unlike the more natural “covering”
formulations for them.

1.2  Other  related  work 

Every ND($\pi$) for nontrivial hereditary $\pi$ is MAX SNP-hard, as pointed out
in [LY93], due to the reductions of [LY80] and the result of [PY91]. Thus, no
polynomial time algorithm can approximate ND($\pi$) to within a factor of $1 + \epsilon$ for
some positive $\epsilon$, unless P = NP [ALM+92]. Yet a better lower bound is provided
by the one in approximation of VC as it serves as a lower bound for every ND($\pi$)
for hereditary $\pi$. Such a bound for VC has been continuously improved in the
last few years, and currently it is known to be as large as $\frac{7}{6}$ [Has97].

The approximation ratio of [BGNR94] for the unweighted FVS was subsequently
extended to the one for the weighted FVS and was further improved
to 2 in [BBF95, BG94], matching the best constant factor known for VC. Recently
Chudak et al. [CGHW96] gave a primal-dual interpretation of these 2-approximation
algorithms of [BBF95, BG94]. They also provided a new primal-dual
algorithm for FVS, which has the same performance ratio but is slightly
simpler than the previous two.


2  Preliminaries 

2.1  Notation  and  Definitions 

For any graph $G$ let $V(G)$ and $E(G)$ denote the vertex set and the edge set,
respectively, of $G$. The subgraph of $G = (V, E)$ induced by $X \subseteq V$ is denoted by
$G[X]$. Let $E[X]$ denote the set of edges induced by $X \subseteq V$, and conversely, let
$V[F]$ for $F \subseteq E$ denote the set of vertices incident to some edge in $F$. $E[X, Y]$ is
the set of edges with one end in $X$ and the other in $Y$. The set of edges incident
to some node of $X$ is denoted $\delta(X)$ and when those edges are restricted to the
ones in a subgraph $G[Y]$ we denote it by $\delta_Y(X)$ ($= \delta(X) \cap E[Y]$). Let $\delta(u)$ ($\delta_Y(u)$,
resp.) be a shortening of $\delta(\{u\})$ ($\delta_Y(\{u\})$, resp.).

A graph property $\pi$ is nontrivial if infinitely many graphs satisfy $\pi$ and infinitely
many graphs fail to satisfy it. It is hereditary (on induced subgraphs) if,
in any graph satisfying $\pi$, every (node-induced, resp.) subgraph also satisfies
$\pi$. For a hereditary property $\pi$ any graph which does not satisfy $\pi$ is called a
forbidden graph for $\pi$, and it is a minimal one if, additionally, every “proper”
(induced, resp.) subgraph of it satisfies $\pi$. Any hereditary property $\pi$ is equivalently
characterized by the set of all minimal forbidden graphs for $\pi$.

It is customary to measure the quality of an approximation algorithm by its
performance ratio, which is the worst case ratio of the optimal solution value to
the value of an approximate solution returned by the algorithm.

2.2  Matroidal  Properties 

One way to represent a matroid $M$ is by a pair of a ground set $E$ and a rank
function $r$ defined on $2^E$. A set $F \subseteq E$ is called

- independent if $r(F) = |F|$ (and conversely, $r(F')$ is the cardinality of a largest
independent subset of $F'$ for an arbitrary $F' \subseteq E$),

- dependent if $r(F) < |F|$,

- a base if it is a maximal (and hence, maximum in any matroid) independent
set, and a circuit if it is a minimal dependent set,

- spanning if $r(F) = r(E)$.

For any matroid $M = (E, r)$ there is the dual matroid $M^d = (E, r^d)$ defined on
the same ground set $E$. The rank functions $r$ and $r^d$ are related s.t.

$r^d(E - F) = (|E| - r(E)) - (|F| - r(F))$

for any $F \subseteq E$ (for more on matroid theory see, for instance, [Wel76]).

Let $M$ be a matroid which can be defined on the edge set of any graph (called
an edge set matroid) and denote by $M(G)$ the matroid defined by $M$ on the edge
set of $G$. To avoid any possible anomaly we stipulate that for any subgraph $H$
of $G$, $M(H)$ is the restriction of $M(G)$ onto $E(H)$. This means that the rank
function of $M(H)$ is that of $M(G)$, but with its domain restricted to subsets of $E(H)$.

We say that a graph property $\pi$ is matroidal if for some edge set matroid
$M$ a (sub)graph $G$ satisfies $\pi$ iff its edge set is independent in $M(G)$ (such a
property is said to be derived from the matroid $M$). Such a property is hereditary
on induced subgraphs because a subset of an independent set is independent in
any matroid. Therefore, node-deletion problems for any nontrivial matroidal
properties are NP-hard and MAX SNP-hard according to the results of [LY80]
and [LY93]. Also note that the family of minimal forbidden graphs for such a
property $\pi$ corresponds to the family of circuits of the corresponding matroid
$M(G)$ for all possible $G$.
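
As a concrete instance of these definitions, consider the cycle matroid, whose independent sets are the acyclic edge sets; the matroidal property derived from it is “acyclic”. The sketch below computes its rank function with a union-find structure, tests independence via $r(F) = |F|$, and evaluates the dual rank of a complement $E - F$ as $(|E| - r(E)) - (|F| - r(F))$. The function names are ours and the code is an illustration only.

```python
def cycle_matroid_rank(edges):
    """Rank of an edge set in the cycle matroid: size of a largest acyclic subset."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    rank = 0
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:  # the edge joins two components, so it closes no cycle
            parent[ru] = rv
            rank += 1
    return rank

def is_independent(edges):
    """F is independent iff r(F) = |F|, i.e. the edge set contains no cycle."""
    return cycle_matroid_rank(edges) == len(edges)

def dual_rank_of_complement(all_edges, subset):
    """r^d(E - F) = (|E| - r(E)) - (|F| - r(F)) for F = subset, E = all_edges."""
    return (len(all_edges) - cycle_matroid_rank(all_edges)) - (
        len(subset) - cycle_matroid_rank(subset)
    )
```

For a triangle, the whole edge set has rank 2 and is therefore dependent (it is a circuit), while any two of its edges are independent.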

2.3  Matroidal  Families  of  Graphs 

A matroidal family of graphs is a non-empty collection $P$ of finite, connected
graphs with the following property: given an arbitrary graph $G$, the edge sets of
the subgraphs of $G$ that are isomorphic to some member of $P$ are the circuits
of a matroid on $E(G)$. The matroid defined this way by the matroidal family $P$
on the edge set of graph $G$ will be denoted by $P(G)$.

The following four matroidal families, $P_0$, $P_1$, $P_2$, and $P_3$, are those that were
discovered first [Sim72, Sim73]. The family $P_0$ consists of one graph only, namely
two nodes with one edge in between. This is also the only finite matroidal family.
The family $P_1$ consists of all the cycles; thus, $P_1(G)$ is the cycle matroid defined
on $E(G)$. The family $P_2$ consists of all the bicycles, where a bicycle is a graph
formed by minimally connecting two independent cycles. These two cycles can
be joined together by either (1) sharing only a single node, (2) sharing only a
connected path, or (3) having a simple path attached only at each end of it.
The family $P_3$ consists of all the even cycles (i.e. cycles of even length) and the
bicycles with no even cycle. The matroidal properties derived from these families
thus correspond, respectively, to “a graph has no edge” ($P_0$), “a graph contains
no cycle” ($P_1$), “every connected component contains at most one cycle” ($P_2$),
and “every connected component contains at most one odd cycle and no even
cycle” ($P_3$). Therefore, ND($\pi$) is actually the VC (FVS, respectively) problem
when $\pi$ is the matroidal property derived from $P_0$ ($P_1$, respectively).

It has been known that in fact there exist infinitely many (uncountably many)
matroidal families of graphs, and the first description of them (countably many
matroidal families) was obtained by Andreae:

Proposition 1 [And78]. Let $s$ and $t$ be integers, $s > 0$ and $-2s + 1 \le t \le 1$.
Let $P_{s,t}$ be the set of all graphs $G$ s.t.

(i) $s|V(G)| + t = |E(G)|$, and

(ii) $G$ is minimal with respect to property (i); i.e., no graph isomorphic to a
proper subgraph of $G$ satisfies property (i).

Then $P_{s,t}$ is a matroidal family.

It is not so hard to verify that $P_1 = P_{1,0}$, $P_2 = P_{1,1}$ and $P_0 = P_{s,-2s+1}$ ($P_3$ is
not of the form $P_{s,t}$).
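The counting half of these identities can be checked by hand. The following worked lines are only a sketch of condition (i) of Proposition 1 for each family; the minimality condition (ii) is not verified here.

```latex
% Condition (i), s|V(G)| + t = |E(G)|, checked on one representative of each family.
\begin{align*}
\text{cycle } C_n \ (P_1 = P_{1,0}):\quad
  & |V| = n,\ |E| = n
  && \Rightarrow\ 1\cdot|V| + 0 = |E| \\
\text{bicycle } (P_2 = P_{1,1}):\quad
  & |E| = |V| + 1
  && \Rightarrow\ 1\cdot|V| + 1 = |E| \\
\text{single edge } (P_0 = P_{s,-2s+1}):\quad
  & |V| = 2,\ |E| = 1
  && \Rightarrow\ s\cdot 2 + (-2s+1) = 1 = |E|
\end{align*}
% Bicycle count, e.g. two cycles of lengths a and b sharing one node:
% |V| = a + b - 1 and |E| = a + b.
```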

3  Primal-Dual  Approximation  for  Matroidal  Properties 

One of the most natural integer program formulations of ND($\pi$), presented here
for the sake of comparison, is the one for a “covering problem”:

$$\min \sum_{u \in V} w_u x_u$$

subject to:

$$\sum_{u \in V(H)} x_u \ge 1 \qquad H \in \mathcal{H}$$

$$x_u \in \{0, 1\} \qquad u \in V$$

where $\mathcal{H}$ is the set of minimal forbidden graphs of ND($\pi$) contained as
subgraphs in $G$. It was indicated in [ENSZ96] that in case of the FVS problem
the linear relaxation of the formulation above introduces an integrality gap (i.e.,
the ratio between the integer and fractional optima) of size as large as $\Omega(\log |V|)$.
We shall show later that there exists another formulation whose integrality gap
is bounded by 2 for many ND($\pi$)'s including FVS (see Corollary 9). Chudak et
al. gave new primal-dual formulations and the algorithms based on them for the
FVS problem in undirected graphs [CGHW96]. These algorithms are not new
ones but actually are primal-dual “interpretations” of the algorithms previously
known from [BBF95, BG94]. We shall show below that in fact every ND($\pi$) with
matroidal $\pi$ has a simple and identical primal-dual formulation as well as an
algorithm based on it. Chudak et al. also gave a new algorithm for the FVS
problem which is a slight simplification of the previous algorithms cited above.
Our algorithm for ND($\pi$) is even simpler than theirs.

We claim that ND($\pi$) on graph $G = (V, E)$ can be formulated by the following
integer program when $\pi$ is a matroidal property derived from $M = (E(G), r)$:

$$\min \sum_{u \in V} w_u x_u$$

subject to:

$$\text{(IP)} \qquad \sum_{u \in S} r^d(\delta_S(u))\, x_u \ge r^d(E[S]) \qquad S \subseteq V$$

$$x_u \in \{0, 1\} \qquad u \in V$$

Theorem 2. When $\pi$ is a matroidal property, $F$ is a solution of ND($\pi$) iff $x_F \in
\{0,1\}^{V}$ (incidence vector of $F$) is a feasible solution to (IP).

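Consider now the dual of the linear programming relaxation of (IP):

$$\max \sum_{S \subseteq V} r^d(E[S])\, y_S$$

subject to:

$$\text{(D)} \qquad \sum_{S : u \in S} r^d(\delta_S(u))\, y_S \le w_u \qquad u \in V$$

$$y_S \ge 0 \qquad S \subseteq V$$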

The primal-dual approximation algorithm, based on (IP) and (D) above, for
ND($\pi$) with matroidal $\pi$ is presented in Fig. 1. We elaborate more on it. The
algorithm starts with $F = \emptyset$, the original graph $G[S'] = (V, E)$ and the dual
feasible solution $y = 0$. Given $F$, if it is not yet a solution of ND($\pi$) there must
exist some set $S \subseteq V$ corresponding to a violated constraint of (IP). In particular
the set of all the remaining nodes $S'$ ($= V - F$) must always be such a set, and
thus we can always choose $S'$ as a “violated set”. The algorithm then increases
the dual variable $y_{S'}$ as much as possible until for some node $u$ in $S'$ the dual
constraint for $u$ becomes tight; i.e., $\sum_{S : u \in S} r^d(\delta_S(u))\, y_S = w_u$. Note that $y_{S'}$
here can indeed be increased because $S'$ is the collection of all those nodes whose
corresponding dual constraints were not yet tight. The algorithm adds $u$ into a
solution set $F$ and at the same time removes it from $S'$. Clearly $F$ eventually
becomes a solution of ND($\pi$) (and to (IP)) while $y$ is kept feasible to (D). Lastly,




Initialize $F = \emptyset$, $S' = V$, $y = 0$, $l = 0$.

While $F$ is not a solution of ND($\pi$) do
    $l \leftarrow l + 1$.
    Increase $y_{S'}$ until for some $u \in S'$ the dual constraint corresponding to $u$
    becomes tight.
    Let $u_l \leftarrow u$.
    Add $u_l$ into $F$ and remove $u_l$ from $S'$.

For $j = l$ downto 1 do
    If $F - \{u_j\}$ is a solution of ND($\pi$) in $G$ then remove $u_j$ from $F$.

Output $F$.

Fig. 1. Primal-Dual Approximation Algorithm for ND($\pi$)


the nodes in $F$ are examined one by one, in the reverse order of their inclusion
to $F$, and whenever any of them is found to be extraneous it is thrown out of $F$.
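As an illustration only, the procedure of Fig. 1 can be sketched in Python for the special case where $\pi$ is derived from $P_1$ (the FVS problem), so that the matroid is the cycle matroid. The sketch assumes a simple undirected graph with positive weights, uses the standard dual rank identity $r^d(F) = |F| + r(E - F) - r(E)$, and keeps the dual solution implicitly as residual weights. The names `forest_rank` and `primal_dual_fvs` are hypothetical and not from the paper.

```python
from fractions import Fraction

def forest_rank(vertices, edges):
    """Cycle-matroid rank of an edge set: size of a spanning forest (union-find)."""
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    rank = 0
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            rank += 1
    return rank

def induced_edges(edges, S):
    """E[S]: edges with both endpoints in S."""
    return [(a, b) for a, b in edges if a in S and b in S]

def primal_dual_fvs(vertices, edges, weight):
    """Sketch of Fig. 1 for FVS; returns the pruned solution F as a list."""
    S = set(vertices)                      # S' = V - F
    residual = {u: Fraction(weight[u]) for u in vertices}  # slack of dual constraints
    F = []
    while True:
        ES = induced_edges(edges, S)
        r_ES = forest_rank(S, ES)
        if r_ES == len(ES):                # r^d(E[S']) = 0: G[S'] is acyclic
            break
        # coeff[u] = r^d(delta_S'(u)) taken in the cycle matroid of G[S']
        coeff = {}
        for u in S:
            rest = [e for e in ES if u not in e]
            coeff[u] = (len(ES) - len(rest)) + forest_rank(S, rest) - r_ES
        # raise y_{S'} until the first dual constraint becomes tight
        u = min((v for v in sorted(S) if coeff[v] > 0),
                key=lambda v: residual[v] / coeff[v])
        eps = residual[u] / coeff[u]
        for v in S:
            residual[v] -= eps * coeff[v]
        F.append(u)
        S.remove(u)
    # reverse-delete step: drop extraneous nodes, last added first
    for u in reversed(list(F)):
        rest = set(vertices) - (set(F) - {u})
        rest_edges = induced_edges(edges, rest)
        if forest_rank(rest, rest_edges) == len(rest_edges):
            F.remove(u)
    return F
```

Exact rational arithmetic (`Fraction`) is used so that ties between tight constraints are detected without floating-point error; ties are broken by vertex order, which is an arbitrary choice of this sketch.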

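The algorithm clearly constructs a feasible solution $F$ of ND($\pi$) and a solution
$y$ feasible for (D). These two solutions are related such that

$$\sum_{u \in F} w_u = \sum_{u \in F} \sum_{S : u \in S} r^d(\delta_S(u))\, y_S = \sum_{S \subseteq V} \Bigl( \sum_{u \in S \cap F} r^d(\delta_S(u)) \Bigr) y_S \qquad (1)$$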

An analysis of this algorithm reduces its performance ratio to the following
combinatorial bound.

Theorem 3. Let $\pi$ be a matroidal property derived from $M = (E(G), r)$. Then
the performance ratio of the primal-dual algorithm is bounded by

$$\max \left\{ \frac{\sum_{u \in X} r^d(\delta(u))}{r^d(E(G))} \right\}$$

where max is taken over any minimal solution $X$ of ND($\pi$) in any graph $G$.


4  Uniformly  Sparse  Graph  Properties 

It was shown in [Fuj96] that when $\pi$ is derived either from $P_0$ or $P_1$ (i.e., the VC
or FVS property) (essentially the same algorithm as) the primal-dual algorithm
delivers a solution with approximation ratio of 2. We add here one more to this
list:

Theorem 4. When $\pi$ is derived from $P_3$ the primal-dual algorithm for ND($\pi$)
has performance ratio of 2.

The case of $P_2 = P_{1,1}$ will be subsumed by the general result given below.

We now turn our attention to a “relaxation” of the matroidal families of
graphs, dropping the connectivity requirement on graphs in the families. Recall
the countably many matroidal families $P_{s,t}$ ($s > 0$, $-2s + 1 \le t \le 1$) of graphs
from Proposition 1. Fix $s$ to 1, let $t$ be any integer $\ge -2s + 1 = -1$ and consider
the sets of graphs, which are no longer required to be connected, using the same
set of definitions as for $P_{s,t}$: i.e., $P_{1,t}$ is the set of all graphs $G$ s.t.

(i) $|V(G)| + t = |E(G)|$, and

(ii) $G$ is minimal with respect to property (i); i.e., no graph isomorphic to a
proper subgraph of $G$ satisfies property (i).

Let $Q_k = P_{1,k+1}$ for $k \ge -2$.

Proposition 5. $Q_k$ defines the set of circuits of a matroid on any (edge set of)
graph for all $k$.

It is useful to observe here what graph properties are actually derived from $Q_k$.
A graph $G = (V, E)$ satisfies the property iff for every $F \subseteq E$, $|F| - |V[F]| \le k$,
and thus we may call a graph with such a property uniformly $k$-sparse.
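The sparsity condition can be tested directly on small graphs. The following brute-force sketch enumerates every nonempty edge subset $F$ (restricting to nonempty $F$ is an assumption made here so that the condition is meaningful for negative $k$); it is exponential and meant only to make the definition concrete. The function name `is_uniformly_k_sparse` is hypothetical.

```python
from itertools import combinations

def is_uniformly_k_sparse(edges, k):
    """Check |F| - |V[F]| <= k for every nonempty subset F of the edge list."""
    edges = list(edges)
    for size in range(1, len(edges) + 1):
        for F in combinations(edges, size):
            touched = {v for e in F for v in e}   # V[F]: endpoints of edges in F
            if len(F) - len(touched) > k:
                return False
    return True
```

For example, a triangle is uniformly 0-sparse but not uniformly (-1)-sparse, matching the properties “at most one cycle per component” and “no cycle” derived from $P_2$ and $P_1$.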

We  should  also  note: 

Proposition 6. $Q_k$ consists of an infinite number of distinct graphs for all $k \ge -1$.

The  next  is  a  key  lemma  of  the  present  paper,  proof  of  which  is  postponed  till 
Sec.  5. 

Lemma 7. Let $\pi$ be a property derived from $Q_k = (E(G), r)$. Suppose $X \subseteq V(G)$
is any minimal solution of ND($\pi$) in any $G$. Then,

$$\sum_{u \in X} r^d(\delta(u)) \le 2 \cdot r^d(E(G)).$$

Finally, observe that given $G = (V, E)$ we can compute efficiently the rank $r(F)$
of any $F \subseteq E$ (and thus $r^d(\delta(u))$ for each $u \in V$) under $Q_k(G)$ (for instance,
using the formula (2)). Therefore, our primal-dual algorithm runs in polynomial
time for every $Q_k$. Now from Lemma 7 and Theorem 3 it easily follows that

Theorem 8. When $\pi$ is the property derived from $Q_k$ for any fixed $k$ the primal-dual
algorithm computes a solution of ND($\pi$) in polynomial time; its performance
ratio is bounded above by 2.

And hence, there exist at least countably many nontrivial hereditary properties
with an infinite number of minimal forbidden graphs, for which the node-deletion
problems are efficiently approximable to within a factor of 2.

We also deduce from Lemma 7, (1), and the fact that $F \cap S$ is a minimal
solution in $G[S]$ whenever $y_S$ is nonzero, that the integrality gap in our formulation
is at most 2 when $\pi$ is derived from $Q_k$. Let $Z_{IP}$ and $Z_D$ be the optimal
values of (IP) and (D), respectively. Then, for any $F$ and $y$ computed by
the primal-dual algorithm,

$$Z_{IP} \le \sum_{u \in F} w_u = \sum_{S \subseteq V} \Bigl( \sum_{u \in S \cap F} r^d(\delta_S(u)) \Bigr) y_S \le \sum_{S \subseteq V} 2\, r^d(E[S])\, y_S \le 2 Z_D.$$

Corollary 9. When $r$ is the rank function of $Q_k$ for any $k$, $Z_{IP} / Z_D \le 2$.

5  Proof  of  Lemma  7 

Definitions. Let $C$ be a connected component. Define the surplus of $C$ by
$sp(C) := |E(C)| - |V(C)|$ and the bounded surplus $\overline{sp}$ of $C$ by $\overline{sp}(C) =
\min\{k, sp(C)\}$. Let $\mathcal{C}^+(F)$ ($\mathcal{C}^-(F)$) denote the set of components, induced by an
edge set $F$, with a positive bounded surplus (with a negative bounded surplus,
respectively). When $E'$ is an edge subset of $E$ define $sp(E')$ to be the surplus of
the graph induced by $E'$.

Notice that $\mathcal{C}^-(F)$ consists of all the acyclic components, each with a
(bounded) surplus of $-1$, induced by $F$. Also notice that for any component
$C$ and for any $E' \subseteq E(C)$, $sp(E') \le sp(C)$. The rank function of the matroid
$Q_k(G)$ defined on $G = (V, E)$ can be given by

$$r(F) = |V[F]| + \min\Bigl\{k, \sum_{C \in \mathcal{C}^+(F)} sp(C)\Bigr\} - |\mathcal{C}^-(F)| \qquad (2)$$

for any $F \subseteq E$.
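Formula (2) translates almost literally into code. The sketch below evaluates it for a simple graph (no loops or parallel edges) and $k \ge 0$, which is the case assumed in the analysis of this section; the function name `qk_rank` is hypothetical.

```python
def qk_rank(F, k):
    """Evaluate formula (2): rank of the edge list F under Q_k, assuming k >= 0."""
    if k < 0:
        raise ValueError("this sketch only covers k >= 0")
    adj = {}
    for a, b in F:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen = set()
    positive_surplus = 0   # sum of sp(C) over components with positive bounded surplus
    acyclic = 0            # |C^-(F)|: components with negative bounded surplus
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:       # depth-first search for the component of `start`
            v = stack.pop()
            for w in adj[v]:
                if w not in comp:
                    comp.add(w)
                    stack.append(w)
        seen |= comp
        n_edges = sum(1 for a, _ in F if a in comp)
        sp = n_edges - len(comp)
        bounded = min(k, sp)
        if bounded > 0:
            positive_surplus += sp
        elif bounded < 0:
            acyclic += 1
    return len(adj) + min(k, positive_surplus) - acyclic
```

For $k = 0$ a triangle has rank 3 (it is independent), while two triangles sharing a node, a bicycle on 5 nodes and 6 edges, has rank 5, consistent with bicycles being the circuits of $Q_0$.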

Assume throughout that $k \ge 0$ (the case of $k \le -1$ is no harder). Consider
first the edge set $E[V - X]$, which must be an independent set of $Q_k(G)$, since
$X$ is a solution of ND($\pi$). Using (2) we have

$$|E[V - X]| = r(E[V - X]) = |V - X| - (\#\text{ of acyclic components in } G[V - X]) + l \qquad (3)$$

for some $0 \le l \le k$. We shall use the following auxiliary lemma in proving
Lemma 7.

Lemma 10. Assume (3). If $X$ is a minimal solution of ND($\pi$) then

$$|E[X, V - X]| \ge (k - l + 1)|X| + \sum_{u \in X} (\#\text{ of acyclic components, in } G[V - X], \text{ adjacent to } u) \qquad (4)$$

Suppose $G$ contains an acyclic component $T$. Then since $X$ contains no node
of $T$, due to its minimality, we can restrict ourselves w.l.o.g. to $G$ without $T$.
So assume that $G$ contains no acyclic component. Now suppose $r(E) < |V| + k$.
Then $E$ must be independent in $Q_k(G)$ and $G$ satisfies $\pi$. But then a solution
$X$ minimal in $G$ must be empty and the inequality in question trivially holds.
So assume that $r(E) = |V| + k$, and using (3) we can write

$$\begin{aligned}
r^d(E) &= |E| - r(E) \\
&= |E[X]| + |E[X, V - X]| - (r(E) - |E[V - X]|) \\
&= |E[X]| + |E[X, V - X]| - (|X| + k - l + (\#\text{ of acyclic components in } G[V - X])) \qquad (5)
\end{aligned}$$

Assume that $|X| \ge 2$ (the case of $|X| \le 1$ is more straightforward and
omitted here). Call a component in $G[V - X]$ that is adjacent to a single
node in $X$ a leaf component. Recall that the dual rank of any $E' \subseteq E$ under
a matroid $M$ can be equivalently defined by

$$r^d(E') = \max\{|E' - B| : B \text{ is a base of } M\}$$

Take any node $u$ of $X$. To estimate the value of $r^d(\delta(u))$ we observe how many
edges incident to $u$ must belong to a base of $Q_k(G)$. Let $I, J \subseteq E$ be mutually
disjoint sets s.t. $I$ is an independent set and $sp(J) < 0$. Then $I \cup J$ in general
is an independent set of $Q_k(G)$. This observation allows us to argue that, for
every acyclic leaf component $T$ which is adjacent only to $u$, any base $B$ of $Q_k(G)$
must use all the edges in $T$ and (at least) one edge connecting $u$ and $T$. Besides
them $B$ must use at least one more edge from $\delta(u)$. To see why, notice first that if
no other edges of $\delta(u)$ belong to $B$ the component of $B$ containing $u$ is a tree at
best. As observed above, however, it is always possible to extend this component
by one more edge incident to it if it exists. So it remains to see that at least one
more edge is incident to $u$, and this is easy to do for, otherwise, $u$ belongs to an
acyclic component of $G$, which we have excluded at the beginning of the current
analysis. Therefore, we can write

$$r^d(\delta(u)) \le |\delta(u)| - ((\#\text{ of acyclic leaf components adjacent to } u) + 1)$$

and hence,


$$\sum_{u \in X} r^d(\delta(u)) \le 2|E[X]| + |E[X, V - X]| - |X| - (\#\text{ of acyclic leaf components}) \qquad (6)$$

Notice that, since there is no isolated acyclic component in $G$, we can reduce (4)
to

$$|E[X, V - X]| \ge (k - l + 1)|X| + 2(\#\text{ of acyclic non-leaf components}) + (\#\text{ of acyclic leaf components}) \qquad (7)$$

Combining (5), (6) and (7),

$$\begin{aligned}
2\, r^d(E) - \sum_{u \in X} r^d(\delta(u))
&\ge |E[X, V - X]| - 2(|X| + k - l + (\#\text{ of acyclic components in } G[V - X])) \\
&\quad + |X| + (\#\text{ of acyclic leaf components in } G[V - X]) \\
&= |E[X, V - X]| - (|X| + 2(k - l)) \\
&\quad - (2(\#\text{ of acyclic non-leaf components}) + (\#\text{ of acyclic leaf components})) \\
&\ge (k - l)(|X| - 2) \\
&\ge 0
\end{aligned}$$
Abstract. An asteroidal triple is a set of three vertices such that there is a path between
any pair of them avoiding the closed neighborhood of the third. A graph is called AT-free
if it does not have an asteroidal triple. We show that there is an $O(n^2 \cdot (\bar{m} + 1))$
time algorithm to compute the maximum cardinality of an independent set for AT-free
graphs, where $n$ is the number of vertices and $\bar{m}$ is the number of non-edges
of the input graph. Furthermore we obtain $O(n^2 \cdot (\bar{m} + 1))$ time algorithms to solve
the INDEPENDENT DOMINATING SET and the INDEPENDENT PERFECT DOMINATING SET
problem on AT-free graphs. We also show how to adapt these algorithms such that
they solve the corresponding problem for graphs with bounded asteroidal number in
polynomial time. Finally we observe that the problems CLIQUE and PARTITION INTO
CLIQUES remain NP-complete when restricted to AT-free graphs.


1  Introduction 

Asteroidal  triples  were  introduced  in  1962  to  characterize  interval  graphs  as  those  chordal 
graphs  that  do  not  contain  an  asteroidal  triple  (short  AT)  [20].  Graphs  not  containing  an  AT 
are  called  asteroidal  triple-free  graphs  (short  AT-free  graphs).  They  form  a  large  class  of 
graphs  containing  interval,  permutation,  trapezoid  and  cocomparability  graphs.  Since  1989 
AT-free graphs have been studied extensively by Corneil, Olariu and Stewart. They have
published  a  collection  of  papers  presenting  many  structural  and  algorithmic  properties  of 
AT-free  graphs  (see  e.g.  [6,  7]).  Further  results  on  AT-free  graphs  were  obtained  in  [18,  23]. 

Up to now the knowledge on the algorithmic complexity of NP-complete graph problems
when restricted to AT-free graphs was relatively small compared to other graph classes. The
problems TREEWIDTH, PATHWIDTH and MINIMUM FILL-IN remain NP-complete on AT-free
graphs [1, 25]. On the other hand, domination-type problems like CONNECTED DOMINATING
SET [7], DOMINATING SET [19] and TOTAL DOMINATING SET [19] can be solved by polynomial
time algorithms for AT-free graphs. However there is a collection of classical NP-complete
graph problems for which the algorithmic complexity when restricted to AT-free
graphs was not known. Prominent representatives are INDEPENDENT SET, CLIQUE, GRAPH
k-COLORABILITY, PARTITION INTO CLIQUES, HAMILTONIAN CIRCUIT and HAMILTONIAN PATH.

A crucial reason for the lack of progress in designing efficient algorithms for NP-complete
problems on AT-free graphs seems to be that none of the typical representations,
that are useful for the design of efficient algorithms on special graph classes, is known to
exist  for  AT-free  graphs.  Contrary  to  well-known  graph  classes  such  as  chordal,  permutation 
and  circular-arc  graphs,  AT-free  graphs  do  not  seem  to  have  a  representation  by  a  geometric 
intersection  model,  an  elimination  scheme  of  vertices  or  edges,  small  separators,  a  small 
number  of  minimal  separators  etc.  However  it  turns  out  that  the  design  of  all  our  algorithms  is 
supported  by  a  structural  property  of  AT-free  graphs,  that  can  be  obtained  from  the  definition 
of  AT-free  graphs  rather  easily. 

Our  approach  in  this  paper  is  similar  to  the  one  used  to  design  algorithms  for  problems 
such as TREEWIDTH [14, 17], MINIMUM FILL-IN [17] and VERTEX RANKING [18] on AT-free
graphs.  However  these  algorithms  have  polynomial  running  time  only  under  the  additional 
constraint  that  the  number  of  minimal  separators  is  bounded  by  a  polynomial  in  the  number 
of  vertices  of  the  graph.  (Notice  that  all  three  problems  are  NP-complete  on  AT-free  graphs.) 
Technically,  for  the  three  different  independent  set  problems  in  this  paper,  we  are  able  to 
replace  the  set  of  all  minimal  separators,  used  in  [14,  17,  18]  -  which  might  be  ‘too  large’ 
in  size  -  by  the  ‘small’  set  of  all  closed  neighborhoods  of  the  vertices  of  the  graph. 

Finding  out  the  algorithmic  complexity  of  INDEPENDENT  SET  on  AT-free  graphs  is  a 
challenging  task.  Besides  the  fact  that  INDEPENDENT  SET  is  a  classical  and  well-studied  NP- 
complete  problem,  the  problem  is  also  interesting  since,  contrary  to  well-known  subclasses 
of  AT-free  graphs  such  as  cocomparability  graphs,  not  all  AT-free  graphs  are  perfect.  Thus 
the polynomial time algorithm for perfect graphs of Grötschel, Lovász and Schrijver [11]
solving  the  INDEPENDENT  SET  problem  does  not  apply  to  AT-free  graphs. 

We present the first polynomial time algorithm solving the NP-complete problem
INDEPENDENT SET, when restricted to AT-free graphs. More precisely, our main result is the
$O(n^2 \cdot (\bar{m} + 1))$ algorithm to compute the maximum cardinality of an independent set in an
AT-free graph. Furthermore we present an $O(n^2 \cdot (\bar{m} + 1))$ time algorithm to solve the
problem INDEPENDENT DOMINATING SET. A similar algorithm solves the problem INDEPENDENT
PERFECT DOMINATING SET in time $O(n^2 \cdot (\bar{m} + 1))$ [3]. We also observe that the problems
CLIQUE and PARTITION INTO CLIQUES remain NP-complete when restricted to AT-free graphs.

A  natural  generalization  of  asteroidal  triples  are  the  so-called  asteroidal  sets.  Structural 
results  for  asteroidal  sets  and  algorithms  for  graphs  with  bounded  asteroidal  number  were 
obtained  in  [15,  21].  Computing  the  asteroidal  number  (i.e.,  the  maximum  cardinality  of  an 
asteroidal  set)  turns  out  to  be  NP-complete  in  general,  but  solvable  in  polynomial  time  for 
many graph classes [16]. Furthermore the results for problems such as TREEWIDTH and MINIMUM
FILL-IN  on  AT-free  graphs  can  be  generalized  to  graphs  with  bounded  asteroidal  number  [15]. 
We  show  how  to  adapt  our  algorithms  to  obtain  polynomial  time  algorithms  for  graphs 
with  bounded  asteroidal  number  solving  the  problems  INDEPENDENT  SET,  INDEPENDENT 
DOMINATING  SET  and  INDEPENDENT  PERFECT  DOMINATING  SET. 


2  Preliminaries 

For a graph $G = (V, E)$ we denote $|V|$ by $n$, $|E|$ by $m$ and the number of edges of the
complement of $G$, which is equal to the number of non-edges of $G$, by $\bar{m}$.

Recall that an independent set in a graph $G$ is a set of pairwise nonadjacent vertices.
The independence number of a graph $G$ denoted by $\alpha(G)$ is the maximum cardinality of an
independent set in $G$.
For a graph $G = (V, E)$ and $W \subseteq V$, $G[W]$ denotes the subgraph of $G$ induced by the
vertices of $W$; we write $\alpha(W)$ for $\alpha(G[W])$. For convenience, for a vertex $x$ of $G$ we write
$G - x$ instead of $G[V \setminus \{x\}]$. Analogously, for a subset $X \subseteq V$ we write $G - X$ instead of
$G[V \setminus X]$. We consider components of a graph as (maximal connected) subgraphs as well
as vertex subsets. For a vertex $x$ of $G = (V, E)$, $N(x) = \{y \in V : \{x, y\} \in E\}$ is the
neighborhood of $x$ and $N[x] = N(x) \cup \{x\}$ is the closed neighborhood of $x$. For $W \subseteq V$,
$N[W] = \bigcup_{x \in W} N[x]$.

A set $S \subseteq V$ is a separator of the graph $G = (V, E)$ if $G - S$ is disconnected.

Definition 1. Let $G = (V, E)$ be a graph. A set $A \subseteq V$ is an asteroidal set if for every
$x \in A$ the set $A \setminus \{x\}$ is contained in one component of $G - N[x]$. An asteroidal set with
three vertices is called an asteroidal triple (short AT).

Notice  that  every  asteroidal  set  is  an  independent  set. 

Remark.  A  triple  {x,  y,  z}  of  vertices  of  G  is  an  asteroidal  triple  if  and  only  if  for  every  two 
of  these  vertices  there  is  a  path  between  them  avoiding  the  closed  neighborhood  of  the  third. 

Definition 2. A graph $G = (V, E)$ is called asteroidal triple-free (short AT-free) if $G$ has
no asteroidal triple.
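Definitions 1 and 2, together with the Remark, suggest a direct brute-force test. The following sketch checks all vertex triples by graph search; it is not the paper's method and makes no attempt at efficiency. The graph is assumed to be given as a dictionary mapping each vertex to the set of its neighbours, and all function names are hypothetical.

```python
from itertools import combinations

def component_of(adj, start, removed):
    """Vertices reachable from `start` in G minus the vertex set `removed`."""
    comp, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in removed and w not in comp:
                comp.add(w)
                stack.append(w)
    return comp

def is_asteroidal_triple(adj, x, y, z):
    """True iff every two of x, y, z are joined by a path avoiding N[third]."""
    for a, b, c in ((x, y, z), (y, z, x), (z, x, y)):
        closed = adj[c] | {c}                 # closed neighborhood N[c]
        if a in closed or b in closed:
            return False
        if b not in component_of(adj, a, closed):
            return False
    return True

def is_at_free(adj):
    """True iff no three vertices of the graph form an asteroidal triple."""
    return not any(is_asteroidal_triple(adj, *t) for t in combinations(adj, 3))
```

For instance, the cycle on six vertices contains the asteroidal triple {0, 2, 4}, whereas the cycle on five vertices and every path are AT-free.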

It is well-known that the INDEPENDENT SET problem ‘Given a graph $G$ and a positive integer
$k$, decide whether $\alpha(G) \ge k$’ is NP-complete [9]. The problem remains NP-complete,
even when restricted to cubic planar graphs [13]. Moreover the independence number is
hard to approximate within a factor of $n^{1-\epsilon}$ for any constant $\epsilon > 0$ [12]. Despite this
discouraging recent result on the complexity of approximation, the independence number can
be computed in polynomial time on many special classes of graphs (see [13]). For example,
the best known algorithm to compute the independence number of a cocomparability graph
has running time $O(n + m)$ [24].

The main result of this paper is an $O(n^2 \cdot (\bar{m} + 1))$ algorithm to compute the maximum
cardinality  of  an  independent  set  in  a  given  AT-free  graph.  The  structural  properties  enabling 
the  design  of  our  algorithms  are  given  in  the  next  three  sections.  In  this  extended  abstract, 
we  restrict  ourselves  to  the  cardinality  case  of  the  problems.  Nevertheless  our  algorithms  can 
be  extended  in  a  straightforward  manner  such  that  they  solve  the  corresponding  problems 
on  graphs  with  real  vertex  weights  (see  [3]). 

3  Intervals 

Let $G = (V, E)$ be an AT-free graph, and let $x$ and $y$ be two distinct nonadjacent vertices of
$G$. Throughout the paper we use $C^x(y)$ to denote the component of $G - N[x]$ containing $y$,
and $r(x)$ to denote the number of components of $G - N[x]$.

Definition 3. A vertex $z \in V \setminus \{x, y\}$ is between $x$ and $y$ if $x$ and $z$ are in one component
of $G - N[y]$ and $y$ and $z$ are in one component of $G - N[x]$.

Equivalently, $z$ is between $x$ and $y$ in $G$ if there is an $x,z$-path avoiding $N[y]$ and there
is a $y,z$-path avoiding $N[x]$.

Definition 4. The interval $I = I(x, y)$ of $G$ is the set of all vertices of $G$ that are between
$x$ and $y$.

Thus $I(x, y) = C^x(y) \cap C^y(x)$.
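The identity $I(x, y) = C^x(y) \cap C^y(x)$ gives an immediate way to compute an interval with two graph searches. The sketch below assumes the graph is a dictionary mapping each vertex to the set of its neighbours; the function names are hypothetical.

```python
def component_of(adj, start, removed):
    """Vertices reachable from `start` in G minus the vertex set `removed`."""
    comp, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for w in adj[v]:
            if w not in removed and w not in comp:
                comp.add(w)
                stack.append(w)
    return comp

def interval(adj, x, y):
    """I(x, y) = C^x(y) ∩ C^y(x) for distinct nonadjacent vertices x and y."""
    if x == y or y in adj[x]:
        raise ValueError("x and y must be distinct and nonadjacent")
    c_x_y = component_of(adj, y, adj[x] | {x})   # component of G - N[x] containing y
    c_y_x = component_of(adj, x, adj[y] | {y})   # component of G - N[y] containing x
    return c_x_y & c_y_x
```

On the path 0-1-2-3-4, the interval between the two end vertices is {2}: vertices 1 and 3 lie in the closed neighborhoods of the ends and are therefore excluded.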
4 Splitting intervals

Let $G = (V, E)$ be an AT-free graph, let $I = I(x, y)$ be a nonempty interval of $G$ and let
$s \in I$. Let $I_1 = I(x, s)$ and $I_2 = I(s, y)$.

Lemma 5. The vertices $x$ and $y$ are in different components of $G - N[s]$.

Proof. Assume $x$ and $y$ would be in the same component of $G - N[s]$. Then there is an
$x,y$-path avoiding $N[s]$. However $s \in I$ implies that there is an $s,y$-path avoiding $N[x]$ and
an $s,x$-path avoiding $N[y]$. Thus $\{s, x, y\}$ is an AT of $G$, a contradiction. □

Corollary 6. $I_1 \cap I_2 = \emptyset$.

Proof. Assume $z \in I_1 \cap I_2$. Then $z \in I_1$ implies that there is a component of $G - N[s]$
containing both $x$ and $z$. Furthermore $z \in I_2$ implies that also $y \in C^s(x)$, contradicting
Lemma 5. □

Lemma 7. $I_1 \subseteq I$ and $I_2 \subseteq I$.

Proof. Let $z \in I_1$. Clearly $s \in I$ implies $s \in C^x(y)$. Thus $z \in I_1$ implies $z \in C^x(y)$.
Clearly $z \in C^s(x)$ since $z \in I_1$. By Lemma 5, $C^s(x)$ is contained in a component of
$G - N[y]$ and obviously this component contains $x$. This proves $z \in I$. Consequently
$I_1 \subseteq I$.

$I_2 \subseteq I$ can be shown analogously. □

Theorem 8. There exist components $C_1^s, C_2^s, \ldots, C_t^s$ of $G - N[s]$ such that

$$I \setminus N[s] = I_1 \cup I_2 \cup \bigcup_{i=1}^{t} C_i^s.$$

Proof. By Lemma 7, we have $I_1 \subseteq I \setminus N[s]$ and $I_2 \subseteq I \setminus N[s]$. By Lemma 5, $x$ and $y$
belong to different components $C^s(x)$ and $C^s(y)$ of $G - N[s]$. Let $z \in I \setminus N[s]$.

Assume $z \in C^s(x)$. There is a $z,y$-path avoiding $N[x]$. This path must contain a vertex
of $N[s]$, showing the existence of a $z,s$-path avoiding $N[x]$. Hence $z \in I_1$.

Similarly $z \in C^s(y)$ implies $z \in I_2$.

Assume $z \notin C^s(x)$ and $z \notin C^s(y)$. Since $z \notin N[s]$, $z$ belongs to the component
$C^s(z)$ of $G - N[s]$. For any vertex $p \in C^s(z)$, there is a $p,z$-path avoiding $N[x]$, since
$C^s(z) \ne C^s(x)$. Since $z \in I$, there is a $z,y$-path avoiding $N[x]$. Hence there is also a
$p,y$-path avoiding $N[x]$. This shows $C^s(z) \subseteq I \setminus N[s]$. □

Corollary 9. Every component of $G[I \setminus (N[s] \cup I_1 \cup I_2)]$ is a component of $G - N[s]$.


5  Splitting  components 

Let $G = (V, E)$ be an AT-free graph. Let $C^x$ be a component of $G - N[x]$ and let $y$ be a
vertex of $C^x$. We study the components of the graph $C^x - N[y]$.

Theorem 10. Let $D$ be a component of $C^x - N[y]$. Then $N[D] \cap (N[x] \setminus N[y]) = \emptyset$ if and
only if $D$ is a component of $G - N[y]$.

Proof. Let $D$ be a component of $C^x - N[y]$ with $N[D] \cap (N[x] \setminus N[y]) = \emptyset$. Since no
vertex of $D$ has a neighbor in $N[x] \setminus N[y]$, $D$ is a component of $G - N[y]$.

Now let $D \subseteq C^x$ be a component of $G - N[y]$. Then $N[D] \cap N[x] \subseteq N[y]$. □

Corollary 11. Let $B$ be a component of $C^x - N[y]$. Then $N[B] \cap (N[x] \setminus N[y]) \ne \emptyset$ if and
only if $B \subseteq C^y(x)$.

Theorem 12. Let $B_1, \ldots, B_\ell$ denote the components of $C^x - N[y]$ that are contained in
$C^y(x)$. Then $I(x, y) = \bigcup_{i=1}^{\ell} B_i$.

Proof. Let $I = I(x, y)$. First we show that $B_i \subseteq I$ for every $i \in \{1, \ldots, \ell\}$. Let $z \in B_i$.
There is an $x,z$-path avoiding $N[y]$, since some vertex in $B_i$ has a neighbor in $N[x] \setminus N[y]$.
Clearly, there is also a $z,y$-path avoiding $N[x]$, since $z$ and $y$ are both in $C^x$. This shows
that $z \in I$. Consequently $\bigcup_{i=1}^{\ell} B_i \subseteq I$.

Suppose $z \in I \setminus \bigcup_{i=1}^{\ell} B_i$. Since $z \notin \bigcup_{i=1}^{\ell} B_i$, the component $D$ of $C^x - N[y]$ containing
$z$ does not contain a vertex with a neighbor in $N[x] \setminus N[y]$. Thus $z \notin C^y(x)$, implying $z \notin I$,
a contradiction. □


6  Computing  the  independence  number 

In  this  section  we  describe  our  algorithm  to  compute  the  independence  number  of  an  AT-free 
graph.  The  algorithm  we  propose  uses  dynamic  programming  on  intervals  and  components. 
All  intervals  and  all  components  are  sorted  according  to  nondecreasing  number  of  vertices. 
Following  this  order,  the  algorithm  determines  the  independence  number  of  each  component 
and  of  each  interval  using  the  formulas  given  in  Lemmas  13,  14  and  15. 

We  start  with  an  obvious  lemma. 

Lemma 13. Let $G = (V, E)$ be any graph. Then

$$\alpha(G) = 1 + \max_{x \in V} \Bigl( \sum_{i=1}^{r(x)} \alpha(C_i^x) \Bigr),$$

where $C_1^x, \ldots, C_{r(x)}^x$ are the components of $G - N[x]$.

Applying  Lemma  13  to  the  decomposition  given  by  Theorems  10  and  12,  we  obtain  the 
following  lemma. 

Lemma 14. Let $G = (V, E)$ be an AT-free graph. Let $x \in V$ and let $C^x$ be a component of
$G - N[x]$. Then

$$\alpha(C^x) = 1 + \max_{y \in C^x} \Bigl( \alpha(I(x, y)) + \sum_{i} \alpha(D_i^y) \Bigr),$$

where the $D_i^y$'s are the components of $G - N[y]$ contained in $C^x$.

Applying Lemma 13 to the decomposition given by Theorem 8, we obtain the following
lemma.

Lemma 15. Let $G = (V, E)$ be an AT-free graph. Let $I = I(x, y)$ be an interval of $G$. If
$I = \emptyset$ then $\alpha(I) = 0$. Otherwise

$$\alpha(I) = 1 + \max_{s \in I} \Bigl( \alpha(I(x, s)) + \alpha(I(s, y)) + \sum_{i} \alpha(C_i^s) \Bigr),$$

where the $C_i^s$'s are the components of $G - N[s]$ contained in $I(x, y)$.

Remark. Notice that the components $D_i^y$ and $C_i^s$ as well as the intervals $I(x, s)$ and $I(s, y)$
on the right-hand side of the formulas in Lemma 14 and Lemma 15 are proper subsets of
$C^x$ and $I$, respectively. Hence $\alpha(C^x)$ (resp. $\alpha(I)$) can be computed by table look-up to
components and intervals with a smaller number of vertices.

Consequently we obtain the following algorithm to compute the independence number $\alpha(G)$ for a given AT-free graph $G = (V, E)$, which is based on dynamic programming.

Step 1 For every $x \in V$ compute all components $C_1^x, C_2^x, \ldots, C_{r(x)}^x$ of $G - N[x]$.

Step 2 For every pair of nonadjacent vertices $x$ and $y$ compute the interval $I(x, y)$.

Step 3 Sort all the components and intervals according to nondecreasing number of vertices.

Step 4 Compute $\alpha(C)$ and $\alpha(I)$ for each component $C$ and each interval $I$ in the order of Step 3.

Step 5 Compute $\alpha(G)$.
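Steps 1 and 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the data structure used in the proof of Theorem 16: the graph is a dict mapping integer vertices to neighbour sets, components are stored as frozensets rather than sorted linked lists, and the function names are hypothetical. The interval is computed from the identity $I(x,y) = C^x(y) \cap C^y(x)$ used in that proof.

```python
def components_avoiding(adj, x):
    """Step 1: map every vertex outside N[x] to its component of G - N[x]."""
    blocked = adj[x] | {x}
    comp_of = {}
    for start in adj:
        if start in blocked or start in comp_of:
            continue
        comp, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in blocked and w not in comp:
                    comp.add(w)
                    stack.append(w)
        comp = frozenset(comp)
        for v in comp:
            comp_of[v] = comp
    return comp_of


def intervals(adj):
    """Step 2: I(x, y) = C^x(y) & C^y(x) for every nonadjacent pair x < y."""
    comp = {x: components_avoiding(adj, x) for x in adj}
    result = {}
    for x in adj:
        for y in adj:
            if x < y and y not in adj[x]:
                result[(x, y)] = comp[x][y] & comp[y][x]
    return result
```

For the path 0-1-2-3-4, the interval of the two end vertices is the single middle vertex, while the interval of vertices 0 and 2 is empty.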

Theorem 16. There is an $O(n^2 \cdot (\overline{m} + 1))$ time algorithm to compute the independence number of a given AT-free graph.

Proof. The correctness of our algorithm follows from the formulas of Lemmas 13, 14 and 15 as well as the order of the dynamic programming.

We show how to obtain the stated time complexity. Clearly, Step 1 can be implemented such that it takes $O(n(n + m))$ time using a linear time algorithm to compute the components of the graph $G - N[x]$ for each vertex $x$ of $G$. For each component of $G - N[x]$, a sorted linked list of all its vertices and its number of vertices is stored. For all nonadjacent vertices $x$ and $y$ there is a pointer $P(x, y)$ to the list of $C^x(y)$. Thus in Step 2, an interval $I(x, y)$ can be computed using the fact that $I(x, y) = C^x(y) \cap C^y(x)$. Hence a sorted vertex list of $I(x, y)$ can be computed in time $O(n)$ for each interval. Consequently the overall time bound for Step 2 is $O(n \cdot (\overline{m} + 1))$. There are at most $n^2$ components and at most $n^2$ intervals and each has at most $n$ vertices. Thus using the linear time sorting algorithm bucket sort, Step 3 can be done in time $O(n^2)$.

The bottleneck for the time complexity of our algorithm is Step 4. First consider a component $C^x$ of $G - N[x]$ and a vertex $y \in C^x$. We need to compute the components of $G - N[y]$ that are contained in $C^x$. Each component $C$ of $G - N[y]$ except $C^y(x)$ is contained in $C^x$ if and only if $C \cap C^x \neq \emptyset$. Thus the components $C$ of $G - N[y]$ with $C \subseteq C^x$ are exactly those components of $G - N[y]$, other than $C^y(x)$, addressed by $P(y, z)$ for some $z \in C^x$. Thus all such components can be found in time $O(|C^x|)$ for fixed vertices $x$ and $y \in C^x$. Hence the computation of $\alpha(C)$ for all components $C$ takes time $\sum_{\{x,y\} \notin E} O(|C^x(y)|) = O(n \cdot (\overline{m} + 1))$.

Now consider an interval $I = I(x, y)$ and a vertex $s \in I$. We need to add up the independence numbers of the components $C^s$ of $G - N[s]$ that are contained in $I$. The components of $G - N[s]$ that are contained in $I$ are exactly those components addressed by $P(s, z)$ for some $z \in I$, except $C^s(x)$ and $C^s(y)$. Thus all such components can be found in time $O(|I(x, y)|)$ for a fixed interval $I(x, y)$ and $s \in I(x, y)$. Hence the computation of $\alpha(I)$ for all intervals $I$ takes time $\sum_{\{x,y\} \notin E} \sum_{s \in I(x,y)} O(|I(x, y)|) = O(n^2 \cdot (\overline{m} + 1))$.

Clearly Step 5 can be done in $O(n^2)$ time. Thus the running time of our algorithm is $O(n^2 \cdot (\overline{m} + 1))$. □


7  Independent  domination 

The  approach  used  to  design  the  presented  polynomial  time  algorithm  to  compute  the 
independence  number  for  AT-free  graphs  can  also  be  used  to  obtain  a  polynomial  time 
algorithm  solving  the  INDEPENDENT  DOMINATING  SET  problem  on  AT-free  graphs.  The  best 
known  algorithm  to  solve  the  weighted  version  of  the  problem  on  cocomparability  graphs 
has  running  time  [4]. 

Definition 17. Let $G = (V, E)$ be a graph. Then $S \subseteq V$ is a dominating set of $G$ if every vertex of $V \setminus S$ has a neighbor in $S$. A dominating set $S \subseteq V$ is an independent dominating set of $G$ if $S$ is an independent set.

We denote by $\gamma_i(G)$ the minimum cardinality of an independent dominating set of the graph $G$. Given an AT-free graph $G$, our next algorithm computes $\gamma_i(G)$. It works very similarly to the algorithm of the previous section.

We present only the formulas used in Steps 4 and 5 of the algorithm (which are similar to those in Lemma 13, Lemma 14 and Lemma 15).

Lemma 18. Let $G = (V, E)$ be a graph. Then

\[ \gamma_i(G) = 1 + \min_{x \in V} \Bigl( \sum_{j=1}^{r(x)} \gamma_i(C_j^x) \Bigr), \]

where $C_1^x, C_2^x, \ldots, C_{r(x)}^x$ are the components of $G - N[x]$.
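As a sanity check on the formula of Lemma 18, the following sketch compares the recursion with a brute-force search over vertex subsets. It is an illustration for small graphs only (both functions are exponential), and the graph representation (dict of neighbour sets) and function names are assumptions made for this example.

```python
from functools import lru_cache
from itertools import combinations


def _components(adj, vertices):
    """Connected components of the subgraph induced by `vertices`."""
    remaining = set(vertices)
    result = []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w in remaining:
                    remaining.remove(w)
                    comp.add(w)
                    stack.append(w)
        result.append(frozenset(comp))
    return result


def gamma_i(adj):
    """Independent domination number via the recursion of Lemma 18."""

    @lru_cache(maxsize=None)
    def g(vertices):
        if not vertices:
            return 0
        return 1 + min(
            sum(g(c) for c in _components(adj, vertices - adj[x] - {x}))
            for x in vertices
        )

    return g(frozenset(adj))


def gamma_i_bruteforce(adj):
    """Smallest vertex set that is both independent and dominating."""
    vs = list(adj)
    for k in range(len(vs) + 1):
        for chosen in combinations(vs, k):
            s = set(chosen)
            independent = all(adj[v].isdisjoint(s) for v in s)
            dominating = all(v in s or adj[v] & s for v in vs)
            if independent and dominating:
                return k
```

Both functions agree on small examples such as a star (value 1) and a path on five vertices (value 2).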

Lemma 19. Let $G = (V, E)$ be an AT-free graph. Let $x \in V$ and let $C^x$ be a component of $G - N[x]$. Then

\[ \gamma_i(C^x) = 1 + \min_{y \in C^x} \Bigl( \gamma_i(I(x,y)) + \sum_{j} \gamma_i(D_j^y) \Bigr), \]

where the $D_j^y$'s are the components of $G - N[y]$ contained in $C^x$.

Lemma 20. Let $G = (V, E)$ be an AT-free graph. Let $I = I(x, y)$ be an interval. If $I = \emptyset$ then $\gamma_i(I) = 0$. Otherwise

\[ \gamma_i(I) = 1 + \min_{s \in I} \Bigl( \gamma_i(I(x,s)) + \gamma_i(I(s,y)) + \sum_{j} \gamma_i(C_j^s) \Bigr), \]

where the $C_j^s$'s are the components of $G - N[s]$ contained in $I(x, y)$.

Design and analysis of the algorithm are done similarly to the previous section. We obtain the following theorem.

Theorem 21. There exists an $O(n^2 \cdot (\overline{m} + 1))$ time algorithm to compute the independence domination number $\gamma_i$ of a given AT-free graph.

In the full version [3] we also show how to obtain an $O(n^2 \cdot (\overline{m} + 1))$ algorithm to compute a minimum cardinality independent perfect dominating set for AT-free graphs.


8  Bounded  asteroidal  number 

In  this  section  we  show  that  the  independence  number  of  graphs  with  bounded  asteroidal 
number  can  be  computed  in  polynomial  time. 

Definition  22.  The  asteroidal  number  of  a  graph  G  is  the  maximum  cardinality  of  an 
asteroidal  set  in  G. 

Hence a graph is AT-free if and only if its asteroidal number is at most two. Furthermore the asteroidal number of a graph $G$ is bounded by $\alpha(G)$, since every asteroidal set is an independent set.

Definition 23. Let $\Omega$ be an asteroidal set of $G$. The lump $L(\Omega)$ is the set of vertices $v$ such that for all $x \in \Omega$ there is a component of $G - N[x]$ containing $v$ and $\Omega \setminus \{x\}$.
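Definition 23 translates into a short computation. The sketch below assumes the usual definition of an asteroidal set (for every $x \in \Omega$, the set $\Omega \setminus \{x\}$ lies in one component of $G - N[x]$), which is not restated in this section, and it only handles sets of at least two vertices. The graph representation and all function names are illustrative assumptions.

```python
def component_of(adj, x, v):
    """Component of G - N[x] containing v, or None if v lies in N[x]."""
    blocked = adj[x] | {x}
    if v in blocked:
        return None
    comp, stack = {v}, [v]
    while stack:
        u = stack.pop()
        for w in adj[u]:
            if w not in blocked and w not in comp:
                comp.add(w)
                stack.append(w)
    return frozenset(comp)


def is_asteroidal(adj, omega):
    """True if, for every x, omega minus x lies in one component of G - N[x]."""
    omega = set(omega)
    for x in omega:
        others = omega - {x}
        if not others:
            continue
        comp = component_of(adj, x, next(iter(others)))
        if comp is None or not others <= comp:
            return False
    return True


def lump(adj, omega):
    """L(omega): vertices sharing a component with omega minus x, for every x."""
    omega = set(omega)
    if len(omega) < 2 or not is_asteroidal(adj, omega):
        raise ValueError("omega must be an asteroidal set with at least two vertices")
    result = set(adj)
    for x in omega:
        # The component of G - N[x] that holds all of omega except x.
        result &= component_of(adj, x, next(iter(omega - {x})))
    return frozenset(result)
```

For a two-element set the lump coincides with $C^x(y) \cap C^y(x)$; on the path 0-1-2-3-4 the lump of the two end vertices is the middle vertex.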

Let $\Omega = \{x_1, \ldots, x_k\}$ be an asteroidal set of cardinality $k \geq 2$ and consider the lump $L = L(\Omega)$.

Let $s$ be an arbitrary vertex in $L$. In this section we show how $N[s]$ splits the lump analogous to Theorem 8.

Consider the components of $G - N[s]$. These components partition $\Omega$ into sets $\Omega_1, \ldots, \Omega_r$, where each $\Omega_i$ is a maximal subset of $\Omega$ contained in a component of $G - N[s]$.

Lemma 24. For each $i = 1, \ldots, r$, the set $\Omega_i^* = \Omega_i \cup \{s\}$ is an asteroidal set in $G$.

Proof. Consider $x \in \Omega_i$. Then, by definition, $\Omega \setminus \{x\}$ and $s$ are contained in one component of $G - N[x]$. Hence, $\Omega_i^* \setminus \{x\}$ is contained in one component of $G - N[x]$. This proves the claim. □

Lemma 25. Let $z \in L$ be in some component $C^*$ of $G - N[s]$ that contains no vertices of $\Omega$. Then $C^* \subseteq L$.

Proof. Let $p \in C^* \setminus \{z\}$. There is a $p,z$-path avoiding $N[x]$ for any vertex $x \in \Omega$. This proves the claim. □

First we consider the case where $r = 1$, i.e., where $\Omega$ is in one component of $G - N[s]$. Then $\Omega \cup \{s\}$ is an asteroidal set.

Lemma 26. If $\Omega$ is contained in one component $C$ of $G - N[s]$, then $L(\Omega \cup \{s\}) = L \cap C$.

Proof. Clearly $L(\Omega \cup \{s\}) \subseteq L \cap C$. Let $z \in L \cap C$ and consider a vertex $x \in \Omega$. Clearly, there is an $x,z$-path avoiding $N[s]$, since $z$ and $x$ are in the component $C$ of $G - N[s]$. Hence $z$ is in the component of $\Omega$ of $G - N[s]$. Consider any other vertex $y \in \Omega$. (Such vertices exist since $|\Omega| \geq 2$.) There exists a $z,y$-path avoiding $N[x]$ since $z \in L$. But also, there exists a $y,s$-path avoiding $N[x]$ since $\Omega \cup \{s\}$ is an asteroidal set. Hence $z$ is in the component of $(\Omega \cup \{s\}) \setminus \{x\}$ of $G - N[x]$. □

Now we consider the case where $r > 1$. Let $L_i = L(\Omega_i \cup \{s\})$ for $i = 1, \ldots, r$. Clearly, $L_i \cap L_j = \emptyset$ for every $i \neq j$.

Lemma 27. Assume $r > 1$ and let $C$ be the component of $G - N[s]$ containing $\Omega_i$. Then $L_i = L \cap C$.

Proof. First let $z \in L \cap C$. Then for all $x$ and $y$ in $\Omega_i$ there is a $z,x$-path avoiding $N[s]$ since $z \in C$ (showing that $z$ and $\Omega_i$ are in one component of $G - N[s]$), and there is a $z,x$-path avoiding $N[y]$ since $z \in L$. For $y' \in \Omega_j$ for any $j \neq i$ there is a $z,y'$-path avoiding $N[x]$, since $z \in L$. Such a path contains a vertex of $N[s]$, and consequently there is a $z,s$-path avoiding $N[x]$. This shows that $z$, $s$ and $\Omega_i \setminus \{x\}$ are in one component of $G - N[x]$ and hence $L \cap C \subseteq L_i$.

Now let $z \in L_i$. This clearly implies $z \in C$. For a vertex $y \in \Omega_j$, $j \neq i$, $s$ and the set $\Omega \setminus \{y\}$ are in one component of $G - N[y]$ since $s \in L$. There is an $s,z$-path avoiding $N[y]$ since $y$ and $z$ belong to different components of $G - N[s]$. Consequently, $z$ and $\Omega \setminus \{y\}$ are in one component of $G - N[y]$.

For a vertex $x \in \Omega_i$, there is a component of $G - N[x]$ containing $s$ and $\Omega \setminus \{x\}$, since $s \in L$. Since $z \in L_i$, there is an $s,z$-path avoiding $N[x]$. Hence also $z$ is in this component of $G - N[x]$ and therefore $L_i \subseteq L \cap C$. □

Theorem 28. There exist components $C_1, \ldots, C_t$ of $G - N[s]$ which contain no vertex of $\Omega$ such that

\[ L \setminus N[s] = \bigcup_{i=1}^{t} C_i \cup \bigcup_{i=1}^{r} L_i. \]

Proof. Let $C_1, \ldots, C_t$ be the components of $G - N[s]$ which contain a vertex of $L$ but no vertex of $\Omega$. Then by Lemma 25 we have $\bigcup_{i=1}^{t} C_i \subseteq L \setminus N[s]$, and by Lemmas 26 and 27 we have $\bigcup_{j=1}^{r} L_j \subseteq L \setminus N[s]$.

Now let $\ell \in L \setminus N[s]$. If $\ell$ is in a component containing $\Omega_i$, $1 \leq i \leq r$, then $\ell \in L_i$ by Lemma 26 or 27. Otherwise there is an index $i$, $1 \leq i \leq t$, such that $\ell \in C_i$. This completes the proof. □

Theorem  28  enables  us  to  generalize  Lemmas  15  and  20  in  the  following  way. 

Lemma 29. Let $L = L(\Omega)$ be a lump of $G$. If $L = \emptyset$ then $\alpha(L) = \gamma_i(L) = 0$. Otherwise

\[ \alpha(L) = 1 + \max_{s \in L} \Bigl( \sum_{j=1}^{t} \alpha(C_j) + \sum_{i=1}^{r} \alpha(L_i) \Bigr), \]

\[ \gamma_i(L) = 1 + \min_{s \in L} \Bigl( \sum_{j=1}^{t} \gamma_i(C_j) + \sum_{k=1}^{r} \gamma_i(L_k) \Bigr), \]

where $C_1, \ldots, C_t$ are the components of $G - N[s]$ which contain no vertex of $\Omega$, and $L_1, \ldots, L_r$ are the lumps $L(\Omega_i \cup \{s\})$ as used in Lemma 24.

Together with Lemmas 13 and 14, 18 and 19, the formulas of Lemma 29 lead to recursive algorithms computing $\alpha(G)$ and $\gamma_i(G)$ for a graph $G$. For any positive integer $k$, these algorithms can be implemented to run in time $O(n^{k+2})$ for all graphs with asteroidal number at most $k$. Analogously to the proof of Theorem 16, the time complexity is now dominated by the term $\sum_{\Omega} \sum_{s \in L(\Omega)} O(|L(\Omega)|) = O(n^{k+2})$, where the sum is taken over all asteroidal sets $\Omega$ of $G$ and all $s \in L(\Omega)$.

As before, our algorithms for graphs with a bounded asteroidal number can be extended to the weighted cases of the problems and the corresponding algorithms have the same time bounds.


9  Conclusions 

In this paper we have shown that the independence number as well as the independence domination number of an AT-free graph can be computed in time $O(n^2 \cdot (\overline{m} + 1))$. The same approach can be used to obtain an $O(n^2 \cdot (\overline{m} + 1))$ algorithm to solve the INDEPENDENT PERFECT DOMINATING SET problem on AT-free graphs. We have shown how to adapt the algorithm computing the independence number in such a way that the new algorithm computes the independence number of a graph with a bounded asteroidal number in polynomial time.

In the full version [3] we show how to extend our algorithms for the problems INDEPENDENT SET and INDEPENDENT DOMINATING SET to AT-free graphs with real vertex weights. Both algorithms run in time $O(n^2 \cdot (\overline{m} + 1))$. Furthermore our algorithms can also be modified such that they compute a maximum weight independent set and a minimum weight independent dominating set in time $O(n^2 \cdot (\overline{m} + 1))$.

Contrary to the independent set problems considered so far, the NP-complete graph problems CLIQUE and PARTITION INTO CLIQUES, which are closely related to INDEPENDENT SET, both remain NP-complete when restricted to the class of AT-free graphs. Concerning CLIQUE, recall that Poljak has shown that INDEPENDENT SET remains NP-complete on triangle-free graphs [9]. Consequently CLIQUE remains NP-complete on graphs with independence number at most two, and thus on AT-free graphs. Similarly, it follows from a recent result due to Maffray and Preissmann (showing that GRAPH k-COLORABILITY remains NP-complete when restricted to triangle-free graphs [22]) that the problem PARTITION INTO CLIQUES remains NP-complete on AT-free graphs.

Consequently CLIQUE and PARTITION INTO CLIQUES are the first NP-complete graph problems (known to us) which are NP-complete on AT-free graphs, but solvable in polynomial time on the class of cocomparability graphs. The latter graph class is the largest well-studied subclass of AT-free graphs which is also a class of perfect graphs.

It would be interesting to find out the algorithmic complexity of the following well-known NP-complete graph problems when restricted to AT-free graphs: GRAPH k-COLORABILITY, HAMILTONIAN CIRCUIT, HAMILTONIAN PATH. These three problems are all known to have polynomial time algorithms for cocomparability graphs [8, 10].

Abstract. In the context of Cousot and Cousot's abstract interpretation theory, we present a general framework to define, study and handle operators modifying abstract domains. In particular, we introduce the notions of operators of refinement and compression of abstract domains. A refinement enhances the precision of an abstract domain; a compression operator (compressor) can exist relative to a given refinement, and it simplifies as much as possible a domain of input for that refinement. The adequacy of our framework is shown by the fact that most of the existing operators on abstract domains fall in it. A precise relationship of adjunction between refinements and compressors is also given, justifying why compressors can be understood as inverses of refinements.

1  Introduction 

It is well known that abstract domains play a fundamental role in abstract interpretation [5, 6], since the precision of an abstract interpretation-based program analysis strongly depends on the expressive power of the chosen abstract domain. Much work has therefore been devoted to defining systematic operators for enhancing the precision of representation of abstract domains. Relevant examples are Cousot and Cousot's reduced product, disjunctive completion and reduced cardinal power [6], Nielson's tensor product [18], Giacobazzi and Ranzato's dependencies and dual-Moore-set completion [13], and the open product and pattern completion of Cortesi et al. [4], to cite the best-known ones. The basic idea is that richer abstract domains can be obtained by combining simpler ones or by lifting them by adding new information. These operators on abstract domains provide high level facilities to tune the analysis in accuracy and cost, and some of them have been included as tools for abstract domain design aid in modern systems for program analysis, for instance in System Z [22] and in PLAI [1].

We carry on this idea of operators enhancing the precision of abstract domains and we present in Sect. 3 a general and precise framework to handle these operators, which encompasses and improves the ideas sketched in [9]. The central notion is that of abstract domain refinement, which intuitively is any operator performing an action of refinement on abstract domains, with respect to their standard ordering relation of precision. There exists a strong link between refinements and closure operators, and many lattice-theoretic properties of closures are inherited by refinements. We introduce a generic pattern of definition for domain refinements, which allows us to recover most of the important refinements listed above. Moreover, as an instance of this scheme, we present a new refinement of completeness. Roughly speaking, an abstract domain $D$ is complete for a semantic function $f$ defined on the concrete domain when no loss of precision is introduced by approximating $f$ in the best possible way (i.e. by considering its best correct approximation, cf. [5, 6]) with respect to $D$. Thus, for a domain $D$, our refinement of completeness provides the most abstract domain which is more precise than $D$ and complete for a given continuous concrete semantic function.

Recently, also operators of simplification of abstract domains have been defined and studied, like the operations of complementation in [3] and least disjunctive basis in [14]. As for refinements, we show in Sect. 4 that these operators can be expressed in a formal and precise way in our framework. Actually, these operators are instances of our notion of operator of compression (or compressor). Roughly speaking, for a given abstract domain refinement $\Re$, its relative compressor simplifies a domain $D$ of input for $\Re$, by returning the domain (if this exists) which contains the least amount of information required as input by $\Re$ to reach the same enhancement obtainable from $D$. This is somehow similar to the operation of compression on files, hence our terminology. In more precise terms, if $\Re$ is a unary refinement and $D$ is an abstract domain, then an abstract domain $A$ is the optimal basis of $D$ for $\Re$ if $A$ is the most abstract solution to the equation $\Re(X) = \Re(D)$. Obviously, if an optimal basis exists then it is necessarily unique. We say that $\Re$ is invertible on a given class of abstract domains if there exists the optimal basis of any domain $D$ in the class. In this case, the compressor $\Re^{-}$ relative to $\Re$ (also called the inverse of $\Re$) provides the optimal basis $\Re^{-}(D)$ of $D$ for $\Re$. The problem of inverting a refinement is often hard to solve in a satisfactory way, and, in general, not all domain refinements admit a corresponding compressor defined for a significant class of abstract domains. We show that complementation and least disjunctive basis give rise, respectively, to the compressors relative to reduced product and disjunctive completion refinements, and we give a generic scheme for defining invertible refinements. Moreover, we show that invertible refinements provide solutions to the problem of decomposing abstract domains into simpler factors. If $\Re$ is an $n$-ary refinement and $D = \Re(D_1, \ldots, D_n)$ then the tuple $\langle D_1, \ldots, D_n \rangle$ can be considered as a decomposition of $D$ relative to $\Re$. We then present a general iterative method which, starting from any decomposition relative to an invertible refinement, provides minimal decompositions, i.e. decompositions involving the most abstract factors.

It is important to note that our notion of inversion of a refinement does not correspond to the more customary inversion in the sense of adjunctions; on the contrary, we observe that, in general, this is not possible. However, we show in Sect. 5 that this asymmetry can be overcome by considering a modified ordering relation between abstract domains, that is induced in a natural way by the refinement itself. We prove that for this lifted order on abstract domains, an invertible refinement and its compressor do constitute an adjunction. This provides a firm mathematical relationship between refinements and compressors, and gives a more precise justification to the use of the term “inverse”.

2 Preliminaries

The structure $\langle uco(C), \sqsubseteq, \sqcup, \sqcap, \lambda x.\top, \lambda x.x \rangle$ denotes the complete lattice of all upper closure operators (shortly closures) on a complete lattice $\langle C, \leq, \vee, \wedge, \top, \bot \rangle$, where $\rho \sqsubseteq \eta$ iff $\forall x \in C.\ \rho(x) \leq \eta(x)$. The complete lattice of all lower closure operators on $C$ is denoted by $lco(C)$ and is dual-isomorphic to $uco(C)$. Recall that each closure operator $\rho \in uco(C)$ is uniquely determined by the set of its fixpoints, which is its image, i.e. $\rho(C) = \{x \in C \mid \rho(x) = x\}$, that $\rho \sqsubseteq \eta$ iff $\eta(C) \subseteq \rho(C)$, and that a subset $X \subseteq C$ is the set of fixpoints of a closure iff $X = \{\wedge Y \mid Y \subseteq X\}$ (note that $\top \in X$). $\langle \rho(C), \leq \rangle$ is a complete meet subsemilattice of $C$ but, in general, it is not a complete sublattice of $C$.

In the standard Cousot and Cousot abstract interpretation theory, abstract domains can be equivalently specified either by Galois connections or by closure operators [6]. In the first case, concrete and abstract domains are related by a pair of adjoint functions. This provides a way to relate domains containing objects having different representation. In the second case instead, an abstract domain is specified as (the set of fixpoints of) an upper closure on the concrete domain. Thus, the closure operator approach is particularly convenient when reasoning about properties of abstract domains independently from the representation of their objects, as in our case. Hence, we will identify $uco(C)$ with the complete lattice of all possible abstract domains of the concrete domain (i.e. any complete lattice) $C$. The ordering on $uco(C)$ corresponds precisely to the standard order used in abstract interpretation to compare abstract domains with regard to their precision: $D_1$ is more precise than $D_2$ iff $D_1 \sqsubseteq D_2$ in $uco(C)$ ($\sqsubset$ denotes strict ordering). The lub and glb on $uco(C)$ have therefore the following meaning as operators on domains. Suppose $\{D_i\}_{i \in I} \subseteq uco(C)$: (i) $\sqcup_{i \in I} D_i$ is the most concrete among the domains which are abstractions of all the $D_i$'s, i.e. it is their least common abstraction; (ii) $\sqcap_{i \in I} D_i$ is (isomorphic to) the well-known reduced product of all the $D_i$'s, and, equivalently, it is the most abstract among the domains (abstracting $C$) which are more concrete than every $D_i$. Whenever $C$ is a meet-continuous complete lattice (i.e., for any chain $Y \subseteq C$ and $x \in C$: $x \wedge (\vee Y) = \vee_{y \in Y} (x \wedge y)$), $uco(C)$ enjoys the lattice-theoretic property of pseudocomplementedness (cf. [12]). This property made it possible to define the operation of complementation of abstract domains (cf. [3]), namely an operation which, starting from any two domains $C \sqsubseteq D$, where $C$ is meet-continuous, gives as result the most abstract domain $C \sim D$ such that $(C \sim D) \sqcap D = C$.
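The identification of abstract domains with sets of fixpoints can be made concrete on a small finite lattice. The sketch below takes as concrete domain the powerset of a finite set ordered by inclusion (so meet is intersection); this choice, the representation of a domain as a frozenset of its fixpoints, and all function names are assumptions made for illustration. It shows how a closure is recovered from its fixpoints, and how the glb (reduced product) and lub (least common abstraction) act on fixpoint sets.

```python
def moore_closure(family, top):
    """Smallest family containing `family` and `top`, closed under intersection."""
    closed = set(family) | {top}
    changed = True
    while changed:
        changed = False
        for a in list(closed):
            for b in list(closed):
                if a & b not in closed:
                    closed.add(a & b)
                    changed = True
    return frozenset(closed)


def closure_operator(fixpoints, top):
    """Upper closure rho whose fixpoints are the given Moore family."""

    def rho(x):
        # rho(x) is the meet of all fixpoints lying above x.
        result = top
        for f in fixpoints:
            if x <= f:
                result = result & f
        return result

    return rho


def reduced_product(d1, d2, top):
    """glb in uco(C): Moore closure of the union of the two fixpoint sets."""
    return moore_closure(d1 | d2, top)


def least_common_abstraction(d1, d2):
    """lub in uco(C): the fixpoints common to both closures."""
    return d1 & d2
```

Note that a more precise domain has more fixpoints, so the order on closures is the reverse of inclusion between fixpoint sets.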

3  Abstract  Domain  Refinements 

Intuitively, an abstract domain refinement is an operator that, for any tuple $\langle D_1, \ldots, D_n \rangle$ of domains of input (ranging on a given domain of definition), provides as output a domain more precise than each $D_i$. It is also very reasonable to expect that such an operator is monotone. These observations naturally lead to the definition below. In the following, a generic tuple of objects is denoted by $\mathbf{O}$, $\pi_i(\mathbf{O})$ denotes its $i$-th component, and $\mathbf{O}[X/i]$ denotes the tuple obtained from $\mathbf{O}$ by replacing $\pi_i(\mathbf{O})$ with $X$. Also, $C$ is a complete lattice acting as the concrete domain and $\mathbf{U} \subseteq uco(C)^n$, $n \geq 1$, is a given tuple of sets of domains abstracting $C$ (for simplicity, we only consider refinements of finite arity, actually those having a practical meaning, although a generalization would be straightforward). When $n = 1$ we denote $\mathbf{U}$ as the set $U \subseteq uco(C)$. We extend on tuples the glb of $uco(C)$: for any tuple of domains $\mathbf{D}$, $\sqcap \mathbf{D} = \sqcap_{1 \leq i \leq n} \pi_i(\mathbf{D})$.

Definition 3.1 A map $\Re : \mathbf{U} \to uco(C)$ is an ($n$-ary abstract domain) refinement if: (i) $\Re$ is monotone; (ii) $\Re$ is reductive: $\forall \mathbf{D} \in \mathbf{U}.\ \Re(\mathbf{D}) \sqsubseteq \sqcap \mathbf{D}$. □

The kernel of definition of any refinement $\Re : \mathbf{U} \to uco(C)$ is given by $\mathbb{K}_{\mathbf{U}} = \bigcap_{1 \leq i \leq n} \pi_i(\mathbf{U})$. Often, refinements are defined on any tuple of abstract domains, i.e., $\Re : uco(C)^n \to uco(C)$, as in the case of reduced product and disjunctive completion, later considered. We will call them full refinements, in order to distinguish them from generic ones as allowed by Definition 3.1. Any $n$-ary refinement $\Re : \mathbf{U} \to uco(C)$ induces a family of refinements of lower arity obtained by fixing some of the domains of input. For instance, by fixing $n - 1$ domains, we get the unary refinements $\lambda X.\Re(\mathbf{D}[X/i]) : \pi_i(\mathbf{U}) \to uco(C)$. Also, $\Re$ induces the canonical unary self-refinement, defined on $\mathbb{K}_{\mathbf{U}}$, which maps $D$ to $\Re(D, \ldots, D)$. Conversely, any $n$-tuple $R = \langle \Re_i \rangle_{1 \leq i \leq n}$ of unary refinements $\Re_i : U_i \to uco(C)$ induces an $n$-ary refinement $\Re_R : U_1 \times \cdots \times U_n \to uco(C)$ defined as $\Re_R(\mathbf{D}) = \sqcap_{1 \leq i \leq n} \Re_i(\pi_i(\mathbf{D}))$, and called attribute independent.

It is important to remark that Definition 3.1 lacks any requirement of idempotence. For instance, for a unary refinement $\Re : U \to uco(C)$ it may well happen that a refined domain $\Re(D) \in U$ can still be object of further refinement, i.e. $\Re(\Re(D)) \sqsubset \Re(D)$. Due to lack of space, in the paper we will only consider examples of idempotent refinements, although a relevant example of nonidempotent refinement can be given by the dependencies between abstract domains of [13]. However, it is worth noting that, by monotonicity, any refinement can be lifted to an idempotent one as the limit of a possibly transfinite Kleene fixpoint iteration sequence. It is therefore reasonable to require idempotence for refinements, i.e. that a refinement upgrades abstract domains all at once.

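Definition 3.2 An $n$-ary refinement $\Re : \mathbf{U} \to uco(C)$ is idempotent if for any $i \in [1, n]$ and $\mathbf{D} \in \mathbf{U}$ such that $\Re(\mathbf{D}) \in \mathbb{K}_{\mathbf{U}}$, $\Re(\mathbf{D}) = \Re(\mathbf{D}[\Re(\mathbf{D})/i])$. □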

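Proposition 3.3 For any $\Re : \mathbf{U} \to uco(C)$, the following are equivalent:

(a) $\Re$ is idempotent;

(b) For any $\mathbf{D} \in \mathbf{U}$ such that $\Re(\mathbf{D}) \in \mathbb{K}_{\mathbf{U}}$, $\Re(\mathbf{D})$ equals the self-refinement of $\Re$ applied to $\Re(\mathbf{D})$, i.e. $\Re(\mathbf{D}) = \Re(\Re(\mathbf{D}), \ldots, \Re(\mathbf{D}))$.

If $\mathbb{K}_{\mathbf{U}}$ is a (finite) meet subsemilattice of $uco(C)$ then (a) is equivalent to:

(c) The self-refinement of $\Re$ is idempotent and for any $\mathbf{D} \in \mathbf{U}$ such that $\Re(\mathbf{D}) \in \mathbb{K}_{\mathbf{U}}$, $\Re(\mathbf{D})$ equals the self-refinement applied to $\sqcap \mathbf{D}$, i.e. $\Re(\mathbf{D}) = \Re(\sqcap \mathbf{D}, \ldots, \sqcap \mathbf{D})$.

The following example yields a generic and useful pattern of definition for full idempotent refinements.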

Example 3.4 Consider any property $P$ of abstract domains, i.e. a subset of the lattice of abstract interpretations $P \subseteq uco(C)$. For any fixed $n \in \mathbb{N}$, define the operator $\Re_P : uco(C)^n \to uco(C)$ as $\Re_P = \lambda \mathbf{D}.\ \sqcup \{A \in uco(C) \mid A \in P,\ A \sqsubseteq \sqcap \mathbf{D}\}$. Thus, $\Re_P(\mathbf{D})$ is the least common abstraction of all domains that satisfy $P$ and are more concrete (viz. precise) than every $\pi_i(\mathbf{D})$ for $i \in [1, n]$. It is immediate to observe that $\Re_P$ is monotone and reductive. Also, it is easily seen that $\Re_P$ satisfies the condition (c) of Proposition 3.3. Thus, $\Re_P$ always defines a full idempotent refinement. However, in general, $\Re_P(\mathbf{D})$ may not satisfy $P$. On the other hand, the following characterization holds.

Proposition 3.5 $(\forall \mathbf{D}.\ \Re_P(\mathbf{D}) \in P) \Leftrightarrow P \in lco(uco(C)) \Rightarrow \Re_P = \lambda \mathbf{D}.P(\sqcap \mathbf{D})$.

Thus, for a property $P$ which is a lower closure, $\Re_P(\mathbf{D})$ is the most abstract domain which satisfies $P$ and is more concrete than every $\pi_i(\mathbf{D})$, or, equivalently, $\Re_P(\mathbf{D})$ is the least extension of $\sqcap \mathbf{D}$ that satisfies $P$. It is also worth noting that $\Re_P(\mathbf{D})$ is the greatest fixpoint of the equation $X = P(X) \sqcap (\sqcap \mathbf{D})$ in $uco(C)$. □
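The pattern of Example 3.4 can be tried out by brute force on a very small concrete domain. The sketch below is only an illustration under several assumptions: the concrete domain is the powerset of a three-element set, each abstract domain is represented by its set of fixpoints (a family containing the top element and closed under intersection), and the property `union_closed` is a sample property chosen for this example. Since a domain is more concrete exactly when it has more fixpoints, $A \sqsubseteq \sqcap \mathbf{D}$ becomes "every $\pi_i(\mathbf{D})$ is a subset of $A$", and the lub becomes intersection of fixpoint sets.

```python
from itertools import combinations


def subsets(items):
    """All subsets of `items`, each as a frozenset."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_moore(family, top):
    """True if `family` contains `top` and is closed under intersection."""
    return top in family and all(a & b in family for a in family for b in family)


def all_domains(universe):
    """Every abstract domain of the powerset lattice, as a set of fixpoints."""
    top = frozenset(universe)
    return [f for f in subsets(subsets(universe)) if is_moore(f, top)]


def refine(prop, domains, pool, top):
    """R_P(D): lub of all domains in `pool` satisfying `prop` below glb(D)."""
    below = [a for a in pool if prop(a) and all(d <= a for d in domains)]
    if not below:
        # The lub of no domains is the top of uco(C), whose only fixpoint is `top`.
        return frozenset({top})
    result = below[0]
    for a in below[1:]:
        result = result & a  # lub of closures = intersection of fixpoint sets
    return result


def union_closed(family):
    """Sample property: the fixpoints are closed under binary union."""
    return all(a | b in family for a in family for b in family)
```

With the trivial property that holds for every domain, `refine` returns the reduced product of its arguments; with `union_closed`, it adds exactly the unions that were missing from the input domain.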

Note that any unary idempotent refinement $\Re : U \to uco(C)$ such that $\Re(U) \subseteq U$ (we say in this case that $\Re$ is well-defined on $U$) actually is a lower closure operator on the poset $\langle U, \sqsubseteq \rangle$, with the order inherited from $uco(C)$, i.e. $\Re \in lco(U)$. In particular, any unary full idempotent refinement is a lower closure on $uco(C)$, i.e. $\Re \in lco(uco(C))$, a case already considered in [9]. Also, for any $n$-ary full idempotent refinement $\Re : uco(C)^n \to uco(C)$, we have that any unary refinement $\lambda X.\Re(\mathbf{D}[X/i])$ ($i \in [1, n]$) induced by $\Re$ is a lower closure operator on $uco(C)$, as well as the canonical self-refinement of $\Re$. It would be straightforward, although notationally tedious, to generalize this latter observation to generic $n$-ary (possibly nonfull) idempotent refinements that satisfy a suitably generalized condition of well-definedness. These observations are fairly important, since unary idempotent refinements inherit all the lattice-theoretic properties of lower closures (see [23] for a few of them). For instance, whenever the domain of definition $\langle U, \sqsubseteq \rangle$ is a complete lattice, we get that these refinements well-defined on $U$ form a complete lattice $\langle lco(U), \sqsubseteq \rangle$ (by a slight abuse of notation, we always use the ordering symbol $\sqsubseteq$ for any kind of closures), where $\Re_1 \sqsubseteq \Re_2$ iff for any $A \in U$, $\Re_1(A) \sqsubseteq \Re_2(A)$ iff the set of abstract domains refined by $\Re_1$ is contained in the set of those refined by $\Re_2$. Thus, analogously to the case of abstract domains, the complete ordering $\sqsubseteq$ between idempotent refinements can be interpreted as a relation of precision among refinement operators, where $\Re_1$ is more precise than $\Re_2$ iff $\Re_1 \sqsubseteq \Re_2$. Moreover, any unary idempotent refinement well-defined on a complete subsemilattice $U$ of $uco(C)$ enjoys the following properties of compositionality w.r.t. the reduced product and least common abstraction.

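Proposition 3.6 If $U$ is a complete meet (join) subsemilattice of $uco(C)$, $\Re : U \to U$ is an idempotent refinement, and $\{D_i\}_{i \in I} \subseteq U$, then $\Re(\sqcap_{i \in I} D_i) = \Re(\sqcap_{i \in I} \Re(D_i))$ ($\Re(\sqcup_{i \in I} \Re(D_i)) = \sqcup_{i \in I} \Re(D_i)$).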

Reduced Product Refinement. The simplest and probably most familiar example of abstract domain refinement is the reduced product [6], which is the glb in the lattice of abstractions. For simplicity, we consider it as a binary refinement. For any fixed concrete domain $C$ (i.e., any complete lattice), reduced product is obviously an idempotent full refinement $\Re_\sqcap : uco(C) \times uco(C) \to uco(C)$. Thus, the unary refinement induced by $\Re_\sqcap$, i.e. $\lambda X. A \sqcap X$, is a lower closure, and for it the properties discussed above hold. It is worth noting that $\Re_\sqcap$ is the simplest instance of the family of refinements defined in Example 3.4, since $\Re_\sqcap$ is $\Re_P$ for the trivial property $P = uco(C)$. Also, $\Re_\sqcap$ is the attribute independent combination of the trivial identity refinements. Reduced product has been successfully applied as a domain refinement in program analysis e.g. in [1, 16, 21].
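The claim that $\lambda X. A \sqcap X$ is a lower closure can be checked by brute force on a small finite instance. The following is a minimal sketch, assuming that the concrete domain is the powerset of a hypothetical three-element universe and that an abstract domain is represented by the image of its closure operator (a family of sets that contains the top element and is closed under intersection). The names `moore_closure`, `reduced_product`, `more_precise` and `all_domains` are illustrative and not taken from the paper.

```python
from itertools import combinations

UNIVERSE = frozenset({0, 1, 2})  # hypothetical universe; C is its powerset


def moore_closure(family):
    """Close a family of subsets under intersection and add the top element."""
    closed = set(family) | {UNIVERSE}
    changed = True
    while changed:
        changed = False
        for a in list(closed):
            for b in list(closed):
                if a & b not in closed:
                    closed.add(a & b)
                    changed = True
    return frozenset(closed)


def reduced_product(a, b):
    """Glb of two abstract domains: the least Moore family containing both."""
    return moore_closure(a | b)


def more_precise(a, b):
    """a is below b in uco(C): a contains every element of b."""
    return b <= a


def all_domains():
    """Enumerate every abstract domain (Moore family) over UNIVERSE."""
    subsets = [frozenset(c) for r in range(len(UNIVERSE) + 1)
               for c in combinations(sorted(UNIVERSE), r)]
    non_top = [s for s in subsets if s != UNIVERSE]
    domains = []
    for r in range(len(non_top) + 1):
        for chosen in combinations(non_top, r):
            family = frozenset(chosen) | {UNIVERSE}
            if moore_closure(family) == family:
                domains.append(family)
    return domains


def is_lower_closure(refine, domains):
    """Check that refine is reductive, idempotent and monotone on domains."""
    reductive = all(more_precise(refine(x), x) for x in domains)
    idempotent = all(refine(refine(x)) == refine(x) for x in domains)
    monotone = all(more_precise(refine(x), refine(y))
                   for x in domains for y in domains if more_precise(x, y))
    return reductive and idempotent and monotone


DOMAINS = all_domains()
A = moore_closure({frozenset({0}), frozenset({0, 1})})
print(is_lower_closure(lambda x: reduced_product(A, x), DOMAINS))
```

The check only covers this finite instance; it illustrates the statement and is not a substitute for the general argument.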

Disjunctive Completion Refinement. The disjunctive completion [6] enhances an abstract domain so that its disjunction operation (i.e. lub) becomes precise (as that of the concrete domain). Abstract domains with a precise disjunction (also called disjunctive abstract domains) correspond to additive closure operators. Disjunctive completion can be given as an instance of the general scheme of Example 3.4, where the property $P$ is given by additivity: $P = uco^a(C)$, the subset of $uco(C)$ of additive closures. Hence, the disjunctive completion $\Re_\vee : uco(C) \to uco(C)$ is defined as $\Re_\vee(D) = \sqcup \{A \in uco^a(C) \mid A \sqsubseteq D\}$.

Thus, $\Re_\vee$ is an idempotent full refinement. It is easy to observe that $uco^a(C)$ defines a lower closure on $uco(C)$. Then, by Proposition 3.5, $\Re_\vee(D)$ is the most abstract disjunctive domain that is more concrete than $D$. The disjunctive completion refinement has been applied in program analysis e.g. in [8, 15, 10].
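On a finite powerset the disjunctive completion can be computed directly. The sketch below assumes that the concrete domain is the powerset of a hypothetical three-element universe, that an abstract domain is given as a family of sets closed under intersection, and that "disjunctive" means closed under binary unions; the treatment of the empty lub is left out of this simplification. The function names are illustrative.

```python
UNIVERSE = frozenset({0, 1, 2})  # hypothetical universe; C is its powerset


def disjunctive_completion(domain):
    """Least family containing domain and closed under both meet and join."""
    closed = set(domain) | {UNIVERSE}
    changed = True
    while changed:
        changed = False
        for a in list(closed):
            for b in list(closed):
                for c in (a & b, a | b):
                    if c not in closed:
                        closed.add(c)
                        changed = True
    return frozenset(closed)


def is_disjunctive(domain):
    """A domain is disjunctive here if the union of two elements is an element."""
    return all(a | b in domain for a in domain for b in domain)


# D can express "is 0" and "is 1" but not "is 0 or 1".
D = frozenset({UNIVERSE, frozenset({0}), frozenset({1}), frozenset()})
completed = disjunctive_completion(D)
print(sorted(sorted(s) for s in completed))
```

In this instance the completion adds exactly the missing disjunction `{0, 1}`, and applying the function a second time changes nothing, which matches idempotency of the refinement.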

Negative Completion Refinement. Assume the concrete domain $C$ is a complete Boolean algebra. It is easy to verify that if $\rho \in uco^a(C)$ then $\neg\rho = \{\neg x \in C \mid x \in \rho\} \in uco^a(C)$. The negative completion refinement is then defined on disjunctive abstract domains, $\Re_\neg : uco^a(C) \to uco(C)$, as follows: $\Re_\neg(A) = A \sqcap \neg A$. Thus, $\Re_\neg$ lifts a given disjunctive abstract domain $A$ to the reduced product of $A$ with its negative abstract domain, namely to the most abstract domain containing both $A$ and $\neg A$. It is now simple to check that $\Re_\neg : uco^a(C) \to uco(C)$ is an idempotent refinement. It is worth noting that, in general, $\Re_\neg(A)$ may not be disjunctive (i.e., $\Re_\neg$ is not well-defined on $uco^a(C)$).

The Refinement of Completeness. Abstract interpretation is intended to create sound approximations of the concrete semantics of programs. If the program semantics is specified as the least fixpoint of a monotone semantic operation $f : C \to C$ on a complete lattice $C$, then, in the closure operator approach, the soundness criterion for an abstract domain given by $\rho \in uco(C)$ and for an abstract monotone semantic operation $f^\sharp : \rho(C) \to \rho(C)$ is $\forall c \in C.\ \rho(f(c)) \le f^\sharp(\rho(c))$. This ensures the global soundness of the abstract semantics, i.e. $\rho(lfp(f)) \le lfp(f^\sharp)$ (cf. [5]). Completeness is the dual relation $\forall c \in C.\ f^\sharp(\rho(c)) \le \rho(f(c))$. Because soundness is always required in abstract interpretation, in the following we abuse terminology and say that $f^\sharp$ is complete for $f$ if $\rho \circ f = f^\sharp \circ \rho$. In this case $\rho(lfp(f)) = lfp(f^\sharp)$. Completeness in abstract interpretation is a quite rare ideal situation, where for a given abstract domain no loss of precision is introduced by abstract semantic operations. Completeness is especially recurrent between (concrete) semantics of programming languages (cf. [2, 7, 11]). Issues of completeness and related notions have also been studied in [17, 19, 20]. Completeness can be made a property of abstract domains, by making this notion independent of the choice of $f^\sharp$. Recall that the best correct approximation of $f$ w.r.t. $\rho$ is given by $\rho \circ f : \rho(C) \to \rho(C)$. Thus, we consider completeness of the best correct approximation: $\rho \in uco(C)$ is complete for $f$ if $\rho \circ f = \rho \circ f \circ \rho$. For example, let us consider the canonical 4-point abstract domain $Sign = \{\emptyset, Z_{<0}, Z_{>0}, Z\}$, which is an obvious abstraction of $\langle \wp(Z), \subseteq \rangle$. It is simple to show that $Sign$ is complete for the monotone operation of integer multiplication $\lambda X. n \cdot X : \wp(Z) \to \wp(Z)$ (where $n \in Z$ and $n \cdot X = \{n \cdot m \mid m \in X\}$). On the other hand, $\rho = \{Z_{>0}, Z\}$ (with $Sign \sqsubseteq \rho$) is not complete for $\lambda X. n \cdot X$ with $n < 0$: in fact, e.g., $\rho(n \cdot \{-3\}) = Z_{>0}$, but, because $\rho(\{-3\}) = Z$, $\rho(n \cdot \rho(\{-3\})) = \rho(Z) = Z$. The property of completeness for a semantic function $f$ is therefore given by $\Gamma(f) = \{\rho \in uco(C) \mid \rho \circ f = \rho \circ f \circ \rho\}$. Following the scheme of Example 3.4, we can define an idempotent full refinement of completeness $\Re_{\Gamma(f)} : uco(C) \to uco(C)$ as $\Re_{\Gamma(f)}(\rho) = \sqcup \{\eta \in uco(C) \mid \eta \in \Gamma(f),\ \eta \sqsubseteq \rho\}$.

Theorem 3.7 If $f$ is continuous then $\Gamma(f) \in lco(uco(C))$.

Thus, by Proposition 3.5, we have that for a continuous $f$, $\Re_{\Gamma(f)}(D)$ actually is the (unique) most abstract domain which includes $D$ and is complete for $f$. For instance, it is possible to check that for $n < 0$, $\Re_{\Gamma(\lambda X. n \cdot X)}(\{Z_{>0}, Z\}) = Sign$.
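The completeness condition $\rho \circ f = \rho \circ f \circ \rho$ for multiplication by a constant can be checked mechanically. The sketch below relies on an assumption that is not in the text: a set of integers is summarised by a signature recording whether it contains negative numbers, zero, and positive numbers. For the two closures considered here, both the closure of a set and the signature of $n \cdot X$ depend only on this signature, so checking the eight signatures covers all subsets of the integers. The names `sign`, `pos_or_all`, `times` and `is_complete` are illustrative.

```python
from itertools import product

# Signature of a set of integers: (has_negative, has_zero, has_positive).
SIGNATURES = list(product([False, True], repeat=3))

EMPTY, NEG, POS, ALL = "empty", "Z<0", "Z>0", "Z"
# Signature of each abstract element, viewed as a set of integers.
SIG_OF = {EMPTY: (False, False, False), NEG: (True, False, False),
          POS: (False, False, True), ALL: (True, True, True)}


def sign(sig):
    """Closure for Sign: least of {empty, Z<0, Z>0, Z} containing the set."""
    neg, zero, pos = sig
    if not (neg or zero or pos):
        return EMPTY
    if neg and not zero and not pos:
        return NEG
    if pos and not zero and not neg:
        return POS
    return ALL


def pos_or_all(sig):
    """Closure for the coarser domain {Z>0, Z}."""
    neg, zero, _ = sig
    return ALL if (neg or zero) else POS


def times(n, sig):
    """Signature of n*X, computed from the signature of X."""
    neg, zero, pos = sig
    if n > 0:
        return sig
    if n < 0:
        return (pos, zero, neg)
    return (False, neg or zero or pos, False)


def is_complete(rho, n):
    """Check rho(n*X) == rho(n*rho(X)) for every signature of X."""
    return all(rho(times(n, s)) == rho(times(n, SIG_OF[rho(s)]))
               for s in SIGNATURES)


minus_three = (True, False, False)            # signature of {-3}
print(pos_or_all(times(-1, minus_three)))     # closure of n*{-3} for n = -1
print(pos_or_all(times(-1, SIG_OF[pos_or_all(minus_three)])))
print(is_complete(sign, -1), is_complete(pos_or_all, -1))
```

The first two printed values reproduce the counterexample with $\{-3\}$ from the text: abstracting before multiplying loses the information that the result is positive.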

4  Abstract  Domain  Compressors 

We have introduced the notion of abstract domain refinement as a formalization (and generalization) of many existing operators devoted to enhancing the expressiveness of abstract domains. However, no operator performing a dual action of simplification on abstract domains has been proposed up till now. We now formalize the idea of a simplifying operator that gives as input to a fixed refinement the simplest domains (i.e. most abstract) which can be object of that refinement. Let $\Re : \mathbf{U} \to uco(C)$ be a (possibly nonidempotent) refinement. Define $\Re_k^- : \mathbf{U} \to uco(C)$, $k \in [1,n]$, as $\Re_k^- = \lambda \mathbf{D}. \sqcup \{A \in \pi_k(\mathbf{U}) \mid \Re(\mathbf{D}[A/k]) = \Re(\mathbf{D})\}$. For $\mathbf{D} \in \mathbf{U}$, $\Re_k^-(\mathbf{D})$ is the least common abstraction of all domains in $\pi_k(\mathbf{U})$ that, when substituted to $\pi_k(\mathbf{D})$ as $k$-th input for $\Re$, do not change the output.

Definition 4.1 $\Re_k^-(\mathbf{D})$ is the $k$-th optimal basis of $\mathbf{D} \in \mathbf{U}$ for $\Re$ if $\Re_k^-(\mathbf{D}) \in \pi_k(\mathbf{U})$ and $\Re(\mathbf{D}) = \Re(\mathbf{D}[\Re_k^-(\mathbf{D})/k])$. The refinement $\Re$ is $k$-invertible (or admits the $k$-th inverse) on $V \subseteq \pi_k(\mathbf{U})$ if for all $\mathbf{D} \in \mathbf{U}[V/k]$, $\Re_k^-(\mathbf{D})$ is the $k$-th optimal basis of $\mathbf{D}$ for $\Re$. When $\Re$ is $k$-invertible, the map $\Re_k^- : \mathbf{U}[V/k] \to \pi_k(\mathbf{U})$ is called the $k$-th compressor for $\Re$.

Note that if the domain of definition $V \subseteq \pi_k(\mathbf{U})$ of the $k$-th compressor is a complete join subsemilattice of $uco(C)$, then the condition $\Re_k^-(\mathbf{D}) \in \pi_k(\mathbf{U})$ in the above definition can be omitted. For $K \subseteq [1,n]$, we say that $\Re$ is $K$-invertible on a $|K|$-tuple $\mathbf{V}$, where $\forall i \in K.\ \pi_i(\mathbf{V}) \subseteq \pi_i(\mathbf{U})$, if it is $k$-invertible on $\pi_k(\mathbf{V})$, for any $k \in K$. In particular, $\Re$ is fully invertible on $\mathbf{V} \subseteq \mathbf{U}$ if it is $[1,n]$-invertible on $\mathbf{V}$. For the simpler case of a unary refinement $\Re : U \to uco(C)$, we have that $\Re^- : U \to uco(C)$ is defined as $\Re^-(D) = \sqcup \{A \in U \mid \Re(A) = \Re(D)\}$, and $\Re$ is invertible on $V \subseteq U$ iff for any $D \in V$, $\Re(\Re^-(D)) = \Re(D)$. It is simple to observe that the above definition of $k$-invertibility can be formulated by using the unary refinements induced by an ($n$-ary) refinement. More precisely, if $\Re : \mathbf{U} \to uco(C)$ is a refinement, then we have already seen that for any $k \in [1,n]$ and $\pi_i(\mathbf{D}) \in \pi_i(\mathbf{U})$ ($i \neq k$), $\lambda X.\Re(\mathbf{D}[X/k]) : \pi_k(\mathbf{U}) \to uco(C)$ is a unary refinement. It is then easily seen that $\Re$ is $k$-invertible in $V \subseteq \pi_k(\mathbf{U})$ iff $\lambda X.\Re(\mathbf{D}[X/k])$ is (1-)invertible on $V \subseteq \pi_k(\mathbf{U})$. In this case, for the compressor $(\lambda X.\Re(\mathbf{D}[X/k]))^- : V \to \pi_k(\mathbf{U})$ and the $k$-th compressor $\Re_k^-$ of $\Re$, the following mutual equality result holds: $\forall D \in V.\ (\lambda X.\Re(\mathbf{D}[X/k]))^-(D) = \Re_k^-(\mathbf{D}[D/k])$.
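The unary compressor $\Re^-(D) = \sqcup \{A \in U \mid \Re(A) = \Re(D)\}$ can be computed by exhaustive search on a small instance. The sketch below assumes the unary refinement $\lambda X. D \sqcap X$ induced by the reduced product, a concrete domain given by the powerset of a hypothetical three-element universe, and abstract domains represented as intersection-closed families containing the top element, so that the lub of domains is the intersection of the families. All function names are illustrative.

```python
from functools import lru_cache
from itertools import combinations

UNIVERSE = frozenset({0, 1, 2})  # hypothetical universe; C is its powerset
SUBSETS = [frozenset(c) for r in range(len(UNIVERSE) + 1)
           for c in combinations(sorted(UNIVERSE), r)]


def moore_closure(family):
    """Close a family of subsets under intersection and add the top element."""
    closed = set(family) | {UNIVERSE}
    changed = True
    while changed:
        changed = False
        for a in list(closed):
            for b in list(closed):
                if a & b not in closed:
                    closed.add(a & b)
                    changed = True
    return frozenset(closed)


@lru_cache(maxsize=None)
def reduced_product(a, b):
    """Glb of two abstract domains (cached, since the search repeats calls)."""
    return moore_closure(a | b)


def all_domains():
    """Enumerate every abstract domain (Moore family) over UNIVERSE."""
    non_top = [s for s in SUBSETS if s != UNIVERSE]
    domains = []
    for r in range(len(non_top) + 1):
        for chosen in combinations(non_top, r):
            family = frozenset(chosen) | {UNIVERSE}
            if moore_closure(family) == family:
                domains.append(family)
    return domains


DOMAINS = all_domains()
TRIVIAL = frozenset({UNIVERSE})   # most abstract domain
FULL = frozenset(SUBSETS)         # most concrete domain


def compressor(d, e):
    """Least common abstraction of all X with d ⊓ X = d ⊓ e."""
    target = reduced_product(d, e)
    candidates = [x for x in DOMAINS if reduced_product(d, x) == target]
    return frozenset.intersection(*candidates)  # lub = intersection of families


def invertible_on_all(d):
    """Check that substituting the compressor for e leaves d ⊓ e unchanged."""
    return all(reduced_product(d, compressor(d, e)) == reduced_product(d, e)
               for e in DOMAINS)


print(all(invertible_on_all(d) for d in DOMAINS))
print(compressor(FULL, FULL) == TRIVIAL, compressor(TRIVIAL, FULL) == FULL)
```

The two boundary cases behave as one would expect: when the fixed argument already is the most concrete domain, the other argument can be compressed to the trivial domain, and when the fixed argument is trivial, nothing can be removed.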

Not all domain refinements are invertible in a satisfactory way. An example is provided by the negative completion refinement $\Re_\neg$ of Sect. 3. In fact, as observed in [9], the optimal basis of the domain $Sign$ (in Sect. 3) for $\Re_\neg$ does not exist. Since $Sign$ enjoys all the most important lattice-theoretic properties, this means that $\Re_\neg$ is not invertible on any really significant class of abstract domains.

As the following result says, compressors relative to idempotent refinements are extensive and idempotent.

Proposition 4.2 If $\Re : \mathbf{U} \to uco(C)$ is idempotent and $k$-invertible in $V$, then the compressor $\Re_k^- : \mathbf{U}[V/k] \to \pi_k(\mathbf{U})$ is extensive (i.e. $\pi_k(\mathbf{D}) \sqsubseteq \Re_k^-(\mathbf{D})$) and idempotent (i.e. $\Re_k^-(\mathbf{D}) \in V \Rightarrow \Re_k^-(\mathbf{D}[\Re_k^-(\mathbf{D})/k]) = \Re_k^-(\mathbf{D})$).

In  general,  compressors  are  neither  monotone  nor  antimonotone;  [14]  proves 
that  the  least  disjunctive  basis  operator  is  neither  monotone  nor  antimonotone, 
and  later  we  will  show  that  the  least  disjunctive  basis  is  the  compressor  relative 
to  the  disjunctive  completion  refinement.  On  the  other  hand,  as  expected,  a 
compressor  applied  to  a  refined  domain  performs  no  further  simplification. 
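
Proposition 4.3 If $\Re : \mathbf{U} \to uco(C)$ is idempotent and $k$-invertible in $V$ then for any $\mathbf{D} \in \mathbf{U}[V/k]$ such that $\Re(\mathbf{D}) \in V$, $\Re_k^-(\mathbf{D}[\Re(\mathbf{D})/k]) = \Re_k^-(\mathbf{D})$.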

An $n$-ary refinement $\Re : \mathbf{U} \to uco(C)$ is commutative if for any permutation $\tau$ of $[1,n]$, $\Re(\pi_{\tau(1)}(\mathbf{D}), \ldots, \pi_{\tau(n)}(\mathbf{D})) = \Re(\mathbf{D})$ holds. For instance, the reduced product refinement is obviously commutative as well as any attribute independent refinement. For commutative refinements, the following result holds (this result admits a straightforward, although notationally tedious, generalization for generic $K$-commutativity and invertibility).

Proposition 4.4 If $\Re : \mathbf{U} \to uco(C)$ is a (possibly nonidempotent) commutative refinement and $k \in [1,n]$ then $\Re$ is fully invertible on $\mathbf{V} \subseteq \mathbf{U}$ iff $\Re$ is $k$-invertible on $\pi_k(\mathbf{V})$ iff for all $\mathbf{D} \in \mathbf{V}$, $\lambda X.\Re(\mathbf{D}[X/k]) : \pi_k(\mathbf{U}) \to uco(C)$ is (1-)invertible on $\pi_k(\mathbf{V})$.

Not all refinements are commutative. Examples of noncommutative refinements are reduced power [6], dependencies [13], and tensor product [18]. Due to lack of space, we do not formalize these operators as refinements. In the following, we show how the results in [3, 12, 14] on complementation and least disjunctive basis of abstract domains actually allow us to define the compressors relative to reduced product and disjunctive completion respectively. These results also suggest a generalization towards a general pattern of invertible refinements.

The Inverse of Reduced Product. Since $\Re_\sqcap$ is commutative, by Proposition 4.4, $\Re_\sqcap$ is fully invertible on some $V \times V \subseteq uco(C)^2$ iff for any $D \in V$, $\lambda X.(D \sqcap X)$ is invertible on $V$. As recalled in Sect. 2, for any meet-continuous complete lattice $C$ and $D \in uco(C)$, one can define the complement abstract domain $C \sim D$. Moreover, it is immediate to note that, for any complete lattice $C$, if $D_1, D_2 \in uco(C)$ satisfy the ascending chain condition (to be ACC, for short; DCC is dual) then $D_1 \sqcap D_2$ is ACC as well, and hence meet-continuous. These observations directly imply that we can invert the reduced product on the ACC abstractions of any concrete domain $C$ (i.e. a plain complete lattice). Let us define $ACC(C) = \{D \in uco(C) \mid D \text{ is ACC}\}$, for any complete lattice $C$.

Theorem 4.5 If $D \in ACC(C)$ then $\lambda X. D \sqcap X$ is invertible on $ACC(C)$, and the corresponding compressor $(\lambda X. D \sqcap X)^- : ACC(C) \to uco(C)$ is defined as $(\lambda X. D \sqcap X)^-(E) = (D \sqcap E) \sim D$.

Thus, $\Re_\sqcap$ is fully invertible on $ACC(C) \times ACC(C)$. For instance, if $D_1, D_2 \in ACC(C)$, we have that the first compressor is $(\Re_\sqcap)_1^-(D_1, D_2) = (D_1 \sqcap D_2) \sim D_2$.

The  Inverse  of  Disjunctive  Completion.  Giacobazzi  and  Ranzato  defined 
and  studied  in  [14]  the  operator  of  least  disjunctive  basis  on  abstract  domains, 
that  corresponds  exactly  to  the  compressor  for  the  disjunctive  completion  re¬ 
finement.  Hence,  the  results  in  [14]  can  be  reformulated  as  follows. 

Theorem 4.6

(i) If $C$ is co-algebraic completely distributive then $\Re_\vee$ is invertible on all $uco(C)$.

(ii) If $C$ is distributive then $\Re_\vee$ is invertible on $\{A \in uco(C) \mid A \text{ is finite}\}$.

Compressing Lower and Upper Refinements. Define an upper (lower) improvement on $C$ as any map $\mathcal{I} : \wp(C) \to \wp(C)$ such that $\forall S \in \wp(C).\ \forall s \in S.\ \forall s' \in \mathcal{I}(S).\ s \le s'$ ($s' \le s$). Glb and lub are obvious examples of lower and upper improvements. We prove that upper and lower improvements induce invertible refinements in a natural way. This provides a general pattern for defining new invertible refinements. For an upper (lower) improvement $\mathcal{I}$ on $C$, define the corresponding upper (lower) set-refinement $\Re_{\mathcal{I}} : \wp(C) \to \wp(C)$ as:

$\Re_{\mathcal{I}}(X) = X \cup (\bigcup_{S \subseteq X} \mathcal{I}(S))$.

It turns out that $\Re_{\mathcal{I}}$ is a lower closure on $\langle \wp(C), \supseteq \rangle$. However, in general, for a closure $\rho \in uco(C)$, $\Re_{\mathcal{I}}(\rho)$ may not be in $uco(C)$. But, when a unary full idempotent refinement $\Re \in lco(uco(C))$ is the restriction on $uco(C)$ of an upper (lower) set-refinement, i.e. there exists an upper (lower) improvement $\mathcal{I}$ on $C$ such that $\Re = \Re_{\mathcal{I}}|_{uco(C)}$ (we then call $\Re$ an upper (lower) refinement), the following general theorem of inversion for $\Re$ holds.

Theorem 4.7 If $C$ is a complete lattice satisfying the DCC (ACC), then any upper (lower) refinement $\Re \in lco(uco(C))$ is invertible on all $uco(C)$.

For instance, if $C$ is distributive and $\mathcal{I} = \vee$, we get for free the inversion of disjunctive completion of Theorem 4.6 (ii). By Proposition 4.4, the attribute independent refinement induced by a family of upper or lower refinements is invertible under suitable hypotheses derived from Theorem 4.7. By this last observation, it would be possible (but we omit the details) to derive as a consequence of Theorem 4.7 the result of inversion for the reduced product of Theorem 4.5.

Minimal $\Re$-decompositions. For a given refinement $\Re : \mathbf{U} \to uco(C)$ of arity $n > 1$, we say that $\mathbf{D} \in \mathbf{U}$ is a $\Re$-decomposition of $D \in uco(C)$ if $D = \Re(\mathbf{D})$. If $\mathbf{D}, \mathbf{E} \in \mathbf{U}$ are two $\Re$-decompositions of $D$ then $\mathbf{D}$ is better than $\mathbf{E}$ if $\mathbf{E} \sqsubseteq \mathbf{D}$ componentwise.^1 The intended meaning is that $\mathbf{D}$ is better than $\mathbf{E}$ because it is a less costly decomposition (in particular, $\sum_{i=1}^{n} |\pi_i(\mathbf{D})| \le \sum_{i=1}^{n} |\pi_i(\mathbf{E})|$). Obviously, this relation induces a partial ordering between $\Re$-decompositions of $D$, but, in general, optimal (i.e. least) $\Re$-decompositions for this order do not exist. For instance, $\langle D, \{\top\} \rangle$ and $\langle \{\top\}, D \rangle$ are incomparable minimal $\Re_\sqcap$-decompositions of $D$. It is easy to see that, if $\Re$ is idempotent and fully invertible on $\mathbf{V} \subseteq \mathbf{U}$ and $\mathbf{D}$ is a $\Re$-decomposition of $D$, then for any $k \in [1,n]$ the tuple $\mathbf{D}[\Re_k^-(\mathbf{D})/k]$ (*) is still a $\Re$-decomposition of $D$ which is better than $\mathbf{D}$, and that $\mathbf{D}$ is a minimal $\Re$-decomposition of $D$ iff $\forall k \in [1,n].\ \pi_k(\mathbf{D}) = \Re_k^-(\mathbf{D})$.

^1 For commutative refinements, both this definition and the successive development would identify decompositions up to permutation; however, we omit the details.

Thus, each $\Re$-decomposition can be improved by iterating the above step (*) as shown in the following nondeterministic function $\Re$-min, where choose selects an arbitrary element from its input set.

Theorem 4.8 Let $\Re$ be an idempotent and fully invertible refinement on $\mathbf{V}$. If, for any $k \in [1,n]$, $\Re_k^-$ is anti-monotone then for any $\mathbf{D} \in \mathbf{V}$, $\Re$-min($\mathbf{D}$) is a minimal $\Re$-decomposition of $\Re(\mathbf{D})$.

Note that, for a $\Re$-decomposition $\mathbf{D}$ of $D$, we can get at most $n!$ different minimal $\Re$-decompositions of $D$. For instance, if $\Re(D_1, D_2) = D$ then $\langle \Re_1^-(D_1, \Re_2^-(D_1, D_2)), \Re_2^-(D_1, D_2) \rangle$ is a minimal $\Re$-decomposition of $D$. Theorem 4.8 generalizes the results of [3, Sect. 4], since the compressor relative to reduced product is anti-monotone (cf. [3]).
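The iteration of step (*) can be sketched for the binary reduced product. The code below is an illustration under explicit assumptions: the concrete domain is the powerset of a hypothetical three-element universe, abstract domains are intersection-closed families containing the top element, compressors are found by exhaustive search, and the nondeterministic choose is replaced by a fixed visiting order passed as a parameter. The names `compressor_k` and `r_min` are illustrative.

```python
from functools import lru_cache
from itertools import combinations

UNIVERSE = frozenset({0, 1, 2})  # hypothetical universe; C is its powerset
SUBSETS = [frozenset(c) for r in range(len(UNIVERSE) + 1)
           for c in combinations(sorted(UNIVERSE), r)]


def moore_closure(family):
    """Close a family of subsets under intersection and add the top element."""
    closed = set(family) | {UNIVERSE}
    changed = True
    while changed:
        changed = False
        for a in list(closed):
            for b in list(closed):
                if a & b not in closed:
                    closed.add(a & b)
                    changed = True
    return frozenset(closed)


@lru_cache(maxsize=None)
def reduced_product(a, b):
    """Glb of two abstract domains (cached for the exhaustive searches)."""
    return moore_closure(a | b)


def all_domains():
    """Enumerate every abstract domain (Moore family) over UNIVERSE."""
    non_top = [s for s in SUBSETS if s != UNIVERSE]
    return [fam for r in range(len(non_top) + 1)
            for chosen in combinations(non_top, r)
            for fam in [frozenset(chosen) | {UNIVERSE}]
            if moore_closure(fam) == fam]


DOMAINS = all_domains()
TRIVIAL = frozenset({UNIVERSE})   # most abstract domain
FULL = frozenset(SUBSETS)         # most concrete domain


def compressor_k(pair, k):
    """k-th compressor of the binary reduced product (k is 0 or 1)."""
    target = reduced_product(*pair)
    other = pair[1 - k]
    candidates = [x for x in DOMAINS if reduced_product(x, other) == target]
    return frozenset.intersection(*candidates)  # lub = intersection of families


def r_min(pair, order=(0, 1)):
    """Apply step (*) once per component, visiting components in `order`."""
    current = list(pair)
    for k in order:
        current[k] = compressor_k(tuple(current), k)
    return tuple(current)


print(r_min((FULL, FULL)) == (TRIVIAL, FULL))
print(r_min((FULL, FULL), order=(1, 0)) == (FULL, TRIVIAL))
```

The two visiting orders yield the two incomparable minimal decompositions mentioned in the text, which is why the outcome of the nondeterministic function depends on the choices made.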

5  A  Relation  of  Adjunction  between  Refinements  and  Compressors 

Assume that $\Re : \mathbf{U} \to uco(C)$ is an idempotent $n$-ary refinement that is $k$-invertible on $V \subseteq \pi_k(\mathbf{U}) \subseteq uco(C)$, for some $k \in [1,n]$. We saw in Sect. 4 that any ($k$-th) unary refinement $\lambda X.\Re(\mathbf{D}[X/k]) : \pi_k(\mathbf{U}) \to uco(C)$ induced by $\Re$ is invertible in $V$, and the corresponding compressor (of type $V \to \pi_k(\mathbf{U})$) is defined as $(\lambda X.\Re(\mathbf{D}[X/k]))^- = \lambda X \in V.\ \Re_k^-(\mathbf{D}[X/k])$. In general, the refinement $\lambda X.\Re(\mathbf{D}[X/k])$ and the relative compressor $\lambda X.\Re_k^-(\mathbf{D}[X/k])$ do not constitute an adjunction on the poset of domains $\langle V, \sqsubseteq \rangle$ of invertibility, i.e. for all $A \in \pi_k(\mathbf{U})$ and $B \in V$, $\Re(\mathbf{D}[A/k]) \sqsubseteq B \Leftrightarrow A \sqsubseteq \Re_k^-(\mathbf{D}[B/k])$ may not hold. This is due to the fact that compressors, in general, are not monotone, as observed after Proposition 4.2. Since, by Proposition 4.2, compressors are idempotent and extensive, this also implies that compressors $\lambda X.\Re_k^-(\mathbf{D}[X/k])$, well-defined on $V$, are not upper closures on $\langle V, \sqsubseteq \rangle$, as instead we would expect by viewing compressors as inverses of refinements.

Function $\Re$-min (referred to in Sect. 4):

fun $\Re$-min ($\mathbf{D}$ : array [1, n] of domains)
    J := [1, n];
    repeat
        k := choose(J);
        J := J \ {k};
        $\mathbf{D}$ := $\mathbf{D}[\Re_k^-(\mathbf{D})/k]$
    until J = $\emptyset$
    output $\mathbf{D}$

We solve this asymmetry between abstract domain refinements and compressors by modifying the standard ordering $\sqsubseteq$ of precision between domains, so as to take into account the role of $\Re$. We maintain the above scenario and also suppose that the refinement $\lambda X.\Re(\mathbf{D}[X/k])$ is well-defined in $\pi_k(\mathbf{U})$, namely for any $D \in \pi_k(\mathbf{U})$, $\Re(\mathbf{D}[D/k]) \in \pi_k(\mathbf{U})$, and that $\langle \pi_k(\mathbf{U}), \sqsubseteq \rangle$ is a complete sublattice of $\langle uco(C), \sqsubseteq \rangle$. These hypotheses imply that $\lambda X.\Re(\mathbf{D}[X/k])$ is a lower closure on the complete lattice $\langle \pi_k(\mathbf{U}), \sqsubseteq \rangle$. Then, we define the following relation $\sqsubseteq^\Re$ (that actually depends also on the fixed arguments $\pi_i(\mathbf{D})$, $i \neq k$) on $\pi_k(\mathbf{U})$:

$A \sqsubseteq^\Re B$ iff $\Re(\mathbf{D}[A/k]) \sqsubseteq \Re(\mathbf{D}[B/k])$ & ($\Re(\mathbf{D}[B/k]) \sqsubseteq \Re(\mathbf{D}[A/k]) \Rightarrow A \sqsubseteq B$).

Theorem 5.1 $\langle \pi_k(\mathbf{U}), \sqsubseteq^\Re \rangle$ is a complete lattice.

Note that $A \sqsubseteq B \Rightarrow A \sqsubseteq^\Re B$. Thus, we call $\sqsubseteq^\Re$ the lifting of $\sqsubseteq$ via $\Re$. This lifted complete partial order reflects precisely the relative precision of domains with respect to the refinement $\lambda X.\Re(\mathbf{D}[X/k])$: $A$ is more precise than $B$ in the lifted order if the refinement of $A$ is more precise than the refinement of $B$ in the standard sense and, when they are the same (i.e. $\Re(\mathbf{D}[B/k]) = \Re(\mathbf{D}[A/k])$), then $A$ contains more information. For this ordering we get back a relation of adjunction between the invertible refinement and its compressor.

Theorem 5.2 $\forall A \in \pi_k(\mathbf{U}), B \in V.\ \Re(\mathbf{D}[A/k]) \sqsubseteq^\Re B \Leftrightarrow A \sqsubseteq^\Re \Re_k^-(\mathbf{D}[B/k])$.

As a consequence, $\lambda X.\Re_k^-(\mathbf{D}[X/k])$ is an upper closure operator on $\langle V, \sqsubseteq^\Re \rangle$ (provided it is well-defined on $V$). For example, we get an adjunction between reduced product and complementation w.r.t. the lifted order. For any complete lattice $C$ and $D \in uco(C)$, the lifted order $\sqsubseteq^D$ on $uco(C)$ is defined as follows: for all $A, B \in uco(C)$, $A \sqsubseteq^D B$ iff $D \sqcap A \sqsubseteq D \sqcap B$ & ($D \sqcap B \sqsubseteq D \sqcap A \Rightarrow A \sqsubseteq B$). Hence, the adjunction between refinement (reduced product) and compressor (complementation) is the following: for any $A \in uco(C)$ and $B \in ACC(C)$, $D \sqcap A \sqsubseteq^D B \Leftrightarrow A \sqsubseteq^D (D \sqcap B) \sim D$.
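The adjunction between reduced product and complementation w.r.t. the lifted order can be tested exhaustively on a small instance. The sketch below assumes the powerset of a hypothetical three-element universe as concrete domain, abstract domains represented as intersection-closed families containing the top element, and the complement $(D \sqcap B) \sim D$ computed by search as the lub of all $X$ with $D \sqcap X = D \sqcap B$. All function names are illustrative.

```python
from functools import lru_cache
from itertools import combinations

UNIVERSE = frozenset({0, 1, 2})  # hypothetical universe; C is its powerset
SUBSETS = [frozenset(c) for r in range(len(UNIVERSE) + 1)
           for c in combinations(sorted(UNIVERSE), r)]


def moore_closure(family):
    """Close a family of subsets under intersection and add the top element."""
    closed = set(family) | {UNIVERSE}
    changed = True
    while changed:
        changed = False
        for a in list(closed):
            for b in list(closed):
                if a & b not in closed:
                    closed.add(a & b)
                    changed = True
    return frozenset(closed)


@lru_cache(maxsize=None)
def reduced_product(a, b):
    """Glb of two abstract domains."""
    return moore_closure(a | b)


def more_precise(a, b):
    """Standard order: a is below b when a contains every element of b."""
    return b <= a


def all_domains():
    """Enumerate every abstract domain (Moore family) over UNIVERSE."""
    non_top = [s for s in SUBSETS if s != UNIVERSE]
    return [fam for r in range(len(non_top) + 1)
            for chosen in combinations(non_top, r)
            for fam in [frozenset(chosen) | {UNIVERSE}]
            if moore_closure(fam) == fam]


DOMAINS = all_domains()


@lru_cache(maxsize=None)
def complement(d, b):
    """(d ⊓ b) ~ d, found as the lub of all X with d ⊓ X = d ⊓ b."""
    target = reduced_product(d, b)
    return frozenset.intersection(
        *[x for x in DOMAINS if reduced_product(d, x) == target])


def lifted_leq(d, a, b):
    """Lifted order for the reduced product with the fixed argument d."""
    ra, rb = reduced_product(d, a), reduced_product(d, b)
    return more_precise(ra, rb) and (not more_precise(rb, ra)
                                     or more_precise(a, b))


def adjunction_holds(d):
    """Check d ⊓ A below B iff A below (d ⊓ B) ~ d, both in the lifted order."""
    return all(lifted_leq(d, reduced_product(d, a), b)
               == lifted_leq(d, a, complement(d, b))
               for a in DOMAINS for b in DOMAINS)


print(all(adjunction_holds(d) for d in DOMAINS[::10]))
```

The lifted order is genuinely coarser than the standard one: with `d = {top, {0}}`, the domain `{top, {1}}` is below `d` in the lifted order although the two are incomparable in the standard order.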
1 Introduction

Consider the "name-switching" function $F = \lambda x.\{l_1 = x.l_2,\ l_2 = x.l_1\}$ in a λ-calculus with records. Most type systems would reject the program $(F\{l_1 = 3\}).l_2$ because the type of $F$ is $\{l_1 : X, l_2 : Y\} \to \{l_1 : Y, l_2 : X\}$, and $\{l_1 : X, l_2 : Y\}$ cannot be unified with $\{l_1 : Int\}$, the type of the record argument. However this program reduces to 3 without error. This shows that the common notion of "erroneous" terms, as implemented in most typed languages, is sometimes over-restrictive. Here we propose a general framework for studying the semantics of programs containing "uncatchable" errors, and a language-independent classification of error propagation properties; this is then applied to a comparison of various λ-calculi. In this approach, errors (written ε) can be passed around as any other value, sometimes in a lazy way, and therefore an error occurring inside a term is not necessarily propagated to the top level; a term is considered "erroneous" if and only if it always generates ε. We define an operational ordering of terms, called "subsumption", which gives a formal foundation for the notion of "substitutability" or "safe replacement" often used informally in the object-oriented literature: a term subsumes another iff it generates fewer errors in all program contexts. Subsumption often implies and sometimes equals the usual approximation ordering (Theorems 21, 26); its main interest is to directly interpret subtyping in a term model, which is simpler than the partial equivalence relations (PERs) of [6] or the coercion functions of [5]. Since we require that errors are "absorbing" (any attempt to interact with an error yields an error again), ε is the top element. Therefore the semantic structure is a lattice, as in the original work of Scott [19].

For the technical development below we make heavy use of labelled reductions, an old idea used in the λ-calculus to restrict the interaction behaviour of a term to a finite number of steps. Here this is generalised in an abstract way to other reduction systems. Labelled reductions allow us to classify both terms and contexts according to the number of interaction steps they can perform, and therefore introduce an operational notion of finite approximation. This in turn can be used as an alternative to the contractive maps of [15] or the embedding-projection pairs of [7] for solving recursive type equations.

2 Basic definitions: error generation and preservation

This section defines a number of abstract notions, independent of any particular language. However, since some concepts need illustrations, informal examples will be drawn from the standard λ-calculus extended with constants and records. Precise definitions for this calculus and other calculi will be given later in Section 5. Prior knowledge of the λ-calculus and the notions of call-by-name (CBN), call-by-value (CBV) and lazy evaluation is assumed; standard references are [3,17,1]. As a reminder, common abbreviations for λ-terms are $I = \lambda x.x$, $K = \lambda xy.x$, $\Delta = \lambda x.xx$, $\Omega = \Delta\Delta$, $Y = \lambda f.(\lambda x.f(xx))(\lambda x.f(xx))$; furthermore $fix\ x.a$ abbreviates $Y(\lambda x.a)$.

Notation. We consider languages of the form $(T, V, \to)$ where $T$ is a set of terms, $V \subseteq T$ is the set of values, and $\to$ is a binary relation on terms (one-step reduction) satisfying $v \in V, v \to v' \Rightarrow v' \in V$. The letters $a, b, c$ range over arbitrary terms, $v, u$ range over values. We assume a set $\mathcal{X} \subseteq T$ of variables and standard notions of bound and free variables; the function $FV : T \to 2^{\mathcal{X}}$ gives the free variables of a term; letters $x, y, z$ range over $\mathcal{X}$. $T^0$ and $V^0$ denote the sets of closed terms and values, i.e. those for which $FV$ returns the empty set. The substitution of $b$ for free occurrences of $x$ in $a$ is written $a[x := b]$. Contexts are terms possibly containing occurrences of a "hole" $[-]$; if $C[-]$ is a context, then $C[a]$ is the term obtained by filling the hole in $C[-]$ with $a$, possibly capturing variables. The set of contexts is written $T[-]$; since there is no restriction on the number of holes, we have $T \subseteq T[-]$. A subterm of $a$ is a term $a'$ such that $a = C[a']$ for some $C[-]$. The reflexive, transitive closure of $\to$ is written $\twoheadrightarrow$ and $=$ is its symmetric closure; $(a \to)$ is an abbreviation for $\exists b, a \to b$. Finally, if $\sqsubseteq_\theta$ is one of the operational ordering relations defined below, with $\theta$ representing any collection of subscripts/superscripts, then $\cong_\theta$ is its symmetric closure and $\sqsubset_\theta$ is its strict restriction, i.e. the relation $\sqsubseteq_\theta \setminus \cong_\theta$.

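Definition 1 (Reduction properties). For a language $(T, V, \to)$, we say that

- $a$ is stuck iff $a \notin V$ and $\neg(a \to)$

- $a$ diverges (written $a \Uparrow$) iff, for each $b$ such that $a \twoheadrightarrow b$, we have $((b \notin V) \wedge (b \to))$. Conversely, $a$ converges ($a \Downarrow$) iff $\exists v \in V, a \twoheadrightarrow v$.

- $\to$ is Church-Rosser (CR) iff $((a \twoheadrightarrow b) \wedge (a \twoheadrightarrow c)) \Rightarrow \exists d.((b \twoheadrightarrow d) \wedge (c \twoheadrightarrow d))$

- $\to$ is compatible iff $a \to b \Rightarrow C[a] \twoheadrightarrow C[b]$ for any context $C[-]$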

Definition 2 (Relevant contexts). A context $C[-]$ is relevant iff $a \Uparrow \Rightarrow C[a] \Uparrow$ and there is a term $b$ such that $C[b] \Downarrow$.

Example 3. Contexts $[-]$, $([-]\,a)$, $((\lambda x.[-])\,a)$ are relevant. The context $(K[-]a)$ is relevant with CBV evaluation, but not with CBN. The context $\lambda x.[-]$ is relevant with both CBV and CBN, but not with lazy evaluation.

Definition 4 (Solvable terms). A term $a$ is solvable iff, for every term $b$, there is a relevant context $C[-]$ such that $C[a] \twoheadrightarrow b$.

Definition 5 (Language properties). A language $(T, V, \to)$

- has divergence iff there is at least a term $\Omega \in T \setminus V$ such that $\Omega \Uparrow$.

- is stuck-free iff $T$ contains no stuck terms.

- has errors iff there is a nonempty subset $\mathcal{E} \subseteq V$ of error values satisfying $v \in \mathcal{E} \Rightarrow \neg(v \to)$. Most often we will consider a singleton set and write ε to denote the single error value. We write $a \Downarrow^{\varepsilon}$ if $a \twoheadrightarrow v \in \mathcal{E}$.

- is error-generating iff there is an $a \in T$ such that $a \Downarrow^{\varepsilon}$ and for every subterm $a'$ of $a$, $a' \notin \mathcal{E}$.

- is error-complete iff, for every value $v \in V^0$, there is a relevant context $C[-]$ such that $C[v] \Downarrow^{\varepsilon}$.

- is error-preserving iff there are no relevant context $C[-]$ and error value $v \in \mathcal{E}$ such that $C[v]$ converges to a non-error value.

Some comments are in order. Absence of stuck terms is easily obtained by adding an error term ε and completing the reduction relation so that stuck terms explicitly reduce to ε. In that case the language is also error-generating. Error-completeness is a closely related, but different property: we will show examples of languages which are error-generating but not error-complete, or vice-versa. Finally, error-preservation ensures that errors are not observable internally; in other words, there is no "catch" construct to recover from errors.

Example 6. The pure λ-calculus with an added error constant ε has stuck terms: $(\varepsilon a)$ does not reduce and is not a value. With an added reduction rule $\forall a, \varepsilon a \to \varepsilon$ the language becomes stuck-free; however it is not error-generating. Error-completeness varies with the evaluation strategy: with CBN evaluation, all values are solvable, and therefore can become errors in some context. By contrast, lazy evaluation admits values which are unsolvable, so then the language is not error-complete: there is no relevant context which can turn $\lambda x.\Omega$ into an error.

Example 7. The λ-calculus with integers and integer operators is error-complete, independently of the evaluation strategy: this is because there are contexts such as $([-][-])$ and $([-] + [-])$ which discriminate between functional values and integer values, even if they are unsolvable.

Example 8. A language like the one in [16], containing constructs isnat, islam, ispr, ... for identifying various syntactic classes of values such as numbers, λ-abstractions or pairs, is not error-preserving: for example the context

if (islam([-])) then +1 else -1

returns -1 for all terms which are not λ-abstractions, including ε. By contrast, the approach of [2], who discriminate between syntactic classes through a single construct

cases a nat : a1 fun : a2 pair : a3 ... end

is error-preserving, provided of course that the cases construct has no "default" clause and no clause to recognize errors.

Example 9. The λ-calculus extended with ε, with records $\{l_1 = a_1 \ldots l_n = a_n\}$ and with a field selection construct $a.l$, together with the obvious reduction and error generation rules, is stuck-free, error-generating, error-complete and error-preserving.
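The behaviour of such a calculus on the name-switching function from the introduction can be illustrated with a small evaluator. The following is a minimal sketch under assumptions that go beyond the text: terms are encoded as nested tuples, evaluation is call-by-name with lazily evaluated record fields, and the "obvious" error generation rules are taken to be that applying a non-function, selecting from a non-record, or selecting a missing label yields ε. The encoding and the name `evaluate` are illustrative.

```python
ERR = "ε"  # the single error value

# Term encoding (an assumption of this sketch):
#   ("int", n) | ("err",) | ("var", x) | ("lam", x, body)
#   | ("app", f, a) | ("rec", {label: term}) | ("sel", term, label)


def evaluate(term, env):
    """Call-by-name evaluation of a closed term; env maps variables to thunks."""
    tag = term[0]
    if tag == "int":
        return term[1]
    if tag == "err":
        return ERR
    if tag == "var":
        return env[term[1]]()            # force the argument only when used
    if tag == "lam":
        return ("closure", term[1], term[2], env)
    if tag == "rec":                     # fields are evaluated lazily
        return ("record", {label: (lambda t=t: evaluate(t, env))
                           for label, t in term[1].items()})
    if tag == "app":
        fun = evaluate(term[1], env)
        if not (isinstance(fun, tuple) and fun[0] == "closure"):
            return ERR                   # includes the absorbing case (ε a)
        _, var, body, closure_env = fun
        arg = term[2]
        return evaluate(body, {**closure_env, var: lambda: evaluate(arg, env)})
    if tag == "sel":
        rec = evaluate(term[1], env)
        if not (isinstance(rec, tuple) and rec[0] == "record"):
            return ERR                   # includes the absorbing case ε.l
        fields = rec[1]
        return fields[term[2]]() if term[2] in fields else ERR
    raise ValueError(f"unknown term: {term!r}")


# F = λx.{l1 = x.l2, l2 = x.l1}, applied to the record {l1 = 3}.
F = ("lam", "x", ("rec", {"l1": ("sel", ("var", "x"), "l2"),
                          "l2": ("sel", ("var", "x"), "l1")}))
ARG = ("rec", {"l1": ("int", 3)})

print(evaluate(("sel", ("app", F, ARG), "l2"), {}))
print(evaluate(("sel", ("app", F, ARG), "l1"), {}))
```

Selecting `l2` never touches the missing field of the argument, so the error stays latent inside the record; selecting `l1` forces it. This is the sense in which an error occurring inside a term need not reach the top level.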

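Following [1,13,16], we can define approximation in an operational way:

Definition 10 (Contextual approximation). Contextual approximation $\sqsubseteq_\Downarrow$ is defined as:

$(a \sqsubseteq_\Downarrow b) \Leftrightarrow (\forall C[-],\ C[a] \Downarrow \Rightarrow C[b] \Downarrow)$

In error-preserving languages, since ε always converges, we have $\Omega \sqsubseteq_\Downarrow a \sqsubseteq_\Downarrow \varepsilon$ for any $a$.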


3  Labelled  reduction 

This  section  borrows  from  Chapter  14  of  [3]  the  idea  of  labelled  reductions.  La¬ 
belled  terms  are  obtained  from  usual  terms  by  decorating  subterms  with  natural 
numbers  which  limit  the  number  of  reduction  steps  they  can  perform.  For  ex¬ 
ample 

is  a  labelled  A-term.  Subterms  without  any  label  are  implicitly  labelled  with  oo. 
We  write  a^,  6^, . . . ,  Ct[~],  Dtl—],...  for  labelled  terms  and  contexts,  and  Ti  for 
the  set  of  labelled  terms.  Given  a  set  V  of  values,  we  define  Vi  as  the  set  of 
labelled  values  satisfying 


$C_\ell[(a_\ell)^0] \in V_\ell \;\Rightarrow\; C[\Omega] \in V$

In  other  words,  labelled  values  can  contain  0  labels  only  in  places  where  the 
corresponding  subterm,  replaced  by  a  divergent  term,  still  yields  a  value  in  the 
original  language:  this  is  typically  the  case  in  lazy  computation  systems  [12],  in 
which  the  outermost  term  constructor  is  enough  to  determine  whether  a  term  is 
a  value  or  not. 

For defining labelled reduction we assume that the original reduction relation is given by a set of rules (lhs → rhs) in some form of rewrite system (possibly dealing with bound variables, as in [14,12,20]). Operators (function symbols) in the left-hand side of a rule which are not at the outermost level are called internal. Given a left-hand side lhs of a rule, a labelling $\ell_N(lhs)$ is obtained by decorating internal operators in lhs with labels in $N$. Each original rule (lhs → rhs) generates labelled rules of shape

t{n+i\n£N}{l'hs)  ->-e 
Labelled reduction is the relation on $T_\ell$ given by all such labelled rules, together with the label elimination rules

(lab1) $(a^m)^n \to_\ell a^{\min(m,n)}$

(lab2) $a^0 \to_\ell \Omega$

Example 11. β-reduction on λ-terms is expressed in [14] as $@(\lambda([x]Z(x)), Z') \to Z(Z')$. The only internal operator is λ, so the corresponding rule for labelled β-reduction is $@(\lambda^{n+1}([x]Z(x)), Z') \to_\ell (Z(Z'))^n$, which in more familiar notation is written

$(\lambda x.a)^{n+1}\, b \to_\ell (a[x := b])^n$

This is not exactly like the definition of [3], which reads:

$(\lambda x.a)^{n+1}\, b \to_\ell (a[x := b^n])^n$

so our labelled reductions are not strongly normalizing, because $b$ could be a divergent term. Nevertheless for the current purpose this is not a problem: labelled reductions still introduce an appropriate notion of finite approximation, as will be shown below. Hence these are intended as a general, abstract mechanism to replace the language-dependent finite projection functions of [2,16,1].

Example 12. In a record calculus, the field extraction rule $\{l_i = a_i\}.l_k \to a_k$ has corresponding labelled rule $\{l_i = a_i\}^{n+1}.l_k \to_\ell (a_k)^n$.

Proposition 13. If $(T, V, \to)$ is stuck-free, with compatible and Church-Rosser reduction, then so is its labelled extension $(T_\ell, V_\ell, \to_\ell)$.

Note  that  Ci  is  never  error- preserving,  as  can  be  seen  easily  by  a  context  like 
([— ]^I)  which  diverges  when  filled  with  s. 

Definition 14 (k-relevant contexts). 1. A context $C[-]$ is k-relevant iff $C[\Omega]\Uparrow$ and there is a term $b$ such that $C[b^{k+1}]\Downarrow$.

2. The relevance index for $C[-]$, written $RI(C[-])$, is the smallest k such that $C[-]$ is k-relevant, or undefined if there is no such k.

3. $\mathcal{C}^k$ denotes the set $\{C[-] \in T[-] \mid RI(C[-]) = k\}$.

The notion of k-relevance captures the number of interaction steps between a context and the term filling it. 0-relevant contexts are contexts which only carry the hole around without interacting with it, like $[-]$, $(I[-])$ or $(\{l = [-]\}.l)$; 1-relevant contexts include the 0-relevant ones, but in addition also include contexts like $([-]I)$ or $([-].l)$ which perform one single interaction step with the hole. More generally, we have:

Lemma 15. 1. Any k-relevant context is also (k + 1)-relevant.

2. A context is relevant iff it is k-relevant for some $k \ge 0$.

Lemma  16  (context  decomposition). 

C[-]  e  c'-'+i  ^  3C'i[C'2[-]]  =  CH.CU-]  eC^A  C2[-]  e 
Proof. If k = 0, there is an easy solution $C_1[-] = C[-]$, $C_2[-] = [-]$. If k > 0,
we  know  i)  3a,  r;,  A  v  and  ii)  V6,C'[6^+^]  fl-  Suppose  v  = 

with  C[—]  A  C'[—],a  A  a^  Then  by  definition  C'[a^^+^]  must  be  a  value, 
contradicting  ii).  So  necessarily 

C[a^-+2]  A  Di[T>2K^+']]  ^  Di[b^+^]  A  ^ 

where  Z)2[«'^‘^^]  ^  is  an  instance  of  a  labelled  reduction  rule.  Now  by  rule 
(/a61),  Di[{D2[a'^'^‘^])^'^^]  A  v,  so  T>i[~]  G  C^',  moreover  02^"^]  ->  -U',  which 

implies  D2  ^  Cf .  ^ 

Now we can use relevance indices of contexts to measure the interactivity of terms; intuitively, a term is k-interactive if it can perform k interaction steps.

Definition 17 (k-interactivity). 1. every term is 0-interactive

2. $a$ is (k + 1)-interactive iff $\exists C[-] \in \mathcal{C}^k,\ C[a]\Downarrow$.

3. the interactivity index of a term $a$, written $II(a)$, is the biggest k such that $a$ is k-interactive, or ∞ if $a$ is k-interactive for every k.

4.  denotes  the  set  {a  G  T|//(a)  <  A}. 

Example 18. - In the lazy λ-calculus [1] all λ-abstractions are values, so the term $\lambda x.\Omega$ is 1-interactive, as well as $(\lambda x.a)^1$ for any function $\lambda x.a$.

- In the standard call-by-name λ-calculus, the term $\lambda x.x\Omega$ is 1-interactive.

As demonstrated by these examples, the notion of k-interactivity not only applies to labelled terms, but also to unlabelled ones. Labels are used as an auxiliary study tool, but then the results can be extracted and give information about the unlabelled language.

4  Erroneous  Terms  and  Subsumption 

We want to allow some errors to occur inside terms, because of the assumption that these will not necessarily be propagated to the top level. However, if a term contains only errors, then it is observationally not different from an error itself. For example, the term $\lambda x.\varepsilon$ is not β-equal to ε, but only yields errors in any context. By contrast, lazy systems admit unsolvable values like $\mu x.\lambda y.x$ or $\mu x.\{l = x\}$ which can interact without ever generating errors. Hence we define the erroneous terms as those which always yield errors after a finite number of interaction steps:

Definition 19 (Erroneous terms). A term $a$ is k-erroneous, written $a{\uparrow}_k$, iff $C[a] \to^* \varepsilon$ for every context $C[-] \in \mathcal{C}^k$. A term $a$ is erroneous, written $a{\uparrow}$, iff it is k-erroneous for some k.

Clearly 0-erroneous terms must belong to the class $\{a \mid a \to^* \varepsilon\}$. Examples of 1-erroneous terms are $\lambda x.\varepsilon$ or $\{l = \varepsilon\}$.
Definition 20 (Subsumption). A term $a$ subsumes another term $b$, written $a \sqsubseteq_\varepsilon b$, iff it generates fewer errors in all program contexts:

$a \sqsubseteq_\varepsilon b \iff \forall C[-],\ C[a]{\uparrow} \Rightarrow C[b]{\uparrow}$

As for $\sqsubseteq_\Downarrow$, we have $\Omega \sqsubseteq_\varepsilon a \sqsubseteq_\varepsilon \varepsilon$ for any $a$ in error-preserving languages. The obvious question then is how the two orderings relate. This in general depends on the language properties, as shown through several examples in the next section. Nevertheless, a general result can be stated already:

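Theorem 21. In an error-complete language, $a \sqsubseteq_\varepsilon b \Rightarrow a \sqsubseteq_\Downarrow b$.

Proof. We will show $(a \sqsubseteq_\varepsilon b) \Rightarrow (\forall C[-],\ C[b]\Uparrow \Rightarrow C[a]\Uparrow)$, from which $(a \sqsubseteq_\Downarrow b)$ directly follows by definition. Suppose $a \sqsubseteq_\varepsilon b$. For any context $C[-]$, furthermore suppose $C[b]\Uparrow$ and $C[a]\Downarrow$. If the language is error-complete, then there exists a relevant context $D[-]$ with $D[C[a]]{\uparrow}$; but since $D[-]$ is relevant, $D[C[b]]\Uparrow$, contradicting $a \sqsubseteq_\varepsilon b$. Hence $C[a]$ must diverge. □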

5  Comparing  various  lambda  calculi 

We will now apply our abstract framework to several languages, all related to the λ-calculus, but with various kinds of extensions, and with two different notions of values: head normal forms (terms without a head redex) or lazy values (terms with an outermost abstraction construct). These are described by fairly standard rules, given in the appendix. Head and lazy versions are distinguished by the superscripts $h$ and $l$.

For the pure λ-calculus Λ the relation $\sqsubseteq_\varepsilon$ clearly is inconsistent since there are no errors. By contrast, $\sqsubseteq_\Downarrow$ on $\Lambda^h$ is the usual approximation relation, and its reflexive closure is the sensible theory of [3], equating all unsolvable terms; $\sqsubseteq_\Downarrow$ on $\Lambda^l$ is the semi-sensible, lazy theory of [1], which equates unsolvable terms of the same order. So in $\Lambda^h$ we have $\Omega =_\Downarrow YK \sqsubseteq_\Downarrow a$ for every $a$, while in $\Lambda^l$ we have $\Omega \sqsubseteq_\Downarrow a \sqsubseteq_\Downarrow YK$. A detailed discussion of these different relations can be found in [1].

Lemma 22. 1. $\lambda x.a \sqsubseteq_\Downarrow \lambda x.b \iff a \sqsubseteq_\Downarrow b$

2. $\lambda x.a \sqsubseteq_\Downarrow b \Rightarrow (b \to^* \lambda x.b') \wedge (a \sqsubseteq_\Downarrow b')$

5.1 Standard λ-calculus with ε

$\Lambda_\varepsilon$ is the pure λ-calculus with an added constant ε and corresponding reduction rule $\varepsilon a \to \varepsilon$.

Lemma 23. In $\Lambda_\varepsilon$, $a{\uparrow} \iff a \to^* \lambda x_1 \ldots x_n.\varepsilon$

Proof. (⇐): easy, $\lambda x_1 \ldots x_n.\varepsilon$ is n-erroneous. (⇒): $a$ must be k-erroneous for some k, so we can use induction on k. □
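Lemma 24. 1. $\Lambda_\varepsilon$ is not error-generating, but is error-preserving.

2. $YK =_\Downarrow \varepsilon$.

3. $\Lambda^h_\varepsilon$ is error-complete, but not $\Lambda^l_\varepsilon$.

Proof. 1: Easy by inspection of rules β and ε. 2: Both are ever-convergent. 3: Values in $\Lambda^h_\varepsilon$ are λ-terms in head normal form, or ε. Since HNFs are solvable, for every $v$ there is always a context $C[-]$ such that $C[v]{\uparrow}$. By contrast, value $\lambda x.\Omega$ in $\Lambda^l_\varepsilon$ never reduces to an error. □

Lemma 25. $\Lambda^h_\varepsilon \models a \sqsubseteq_\varepsilon b \iff \Lambda^l_\varepsilon \models a \sqsubseteq_\varepsilon b$.

Proof. By Lemma 23 the error terms in both calculi are the same. □

Theorem 26. 1. In both $\Lambda^h_\varepsilon$ and $\Lambda^l_\varepsilon$, $a \sqsubseteq_\Downarrow b \Rightarrow a \sqsubseteq_\varepsilon b$

2. In $\Lambda^h_\varepsilon$, $a \sqsubseteq_\Downarrow b \iff a \sqsubseteq_\varepsilon b$

Proof. 1: suppose $a \sqsubseteq_\Downarrow b$. By Lemma 23, for any context $C[-]$, if $C[a]{\uparrow}$ then $C[a] \to^* \lambda x_1 \ldots x_n.\varepsilon$. Therefore by Lemma 22 $C[b] \to^* \lambda x_1 \ldots x_n.b'$ with $\varepsilon \sqsubseteq_\Downarrow b'$, so $C[b]{\uparrow}$. 2: (⇒) preceding part of the theorem. (⇐): from Theorem 21, knowing that $\Lambda^h_\varepsilon$ is error-complete. □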


5.2 λ-calculus with records

The λ-calculus is now extended with records, i.e. collections of bindings from names to terms. As usual, these are written with curly braces; we use the vector notation $\{\overline{l_i = a_i}\}$ to denote the record with finite list of fields $l_1 = a_1, \ldots, l_n = a_n$, with all $l_i$ distinct. The expression $(\overline{l_i = a_i} \setminus l)$ denotes removal of field $l$ (if present) in a collection of bindings. Here all records are considered as values, which is perhaps a debatable choice, but conforms to an often similar choice in calculi with tuples [16].

Lemma 27. 1. $\Lambda_{\{\}}$ is error-generating, error-complete and error-preserving for both the head and the lazy calculus.

2. $\Lambda^h_{\{\}} \models a \sqsubseteq_\varepsilon b \iff \Lambda^l_{\{\}} \models a \sqsubseteq_\varepsilon b$.

Proof. 1: Error-generating: obvious. Error-complete: each closed value is either of record shape or of functional shape. In each case there is a context $([-]a)$ or $[-].l$ which generates an error. Error-preserving: easy by inspection of the reduction rules. 2: As for $\Lambda_\varepsilon$ (Lemma 25): the error terms are the same (although the proof here is slightly more complex, as error terms may also be of record shape). □

Since now even the lazy calculus is error-complete, the "ogre" YK has a different status than in $\Lambda^l_\varepsilon$:

Proposition 28. In $\Lambda_{\{\}}$, $\neg(YK =_\Downarrow \varepsilon)$

Proof. Because $\Lambda_{\{\}}$ is error-complete and because of Theorem 21, it suffices to show $\neg(YK =_\varepsilon \varepsilon)$. In the empty context $[-]$, there is no k such that YK is k-erroneous, because it can consume an infinite number of arguments without yielding an error. □

On the other hand there is a new term which is erroneous, namely the empty record:

Proposition 29. In $\Lambda_{\{\}}$, $\{\} =_\varepsilon \varepsilon$

Proof. By inspection of the reduction rules, $\{\}$ cannot interact without yielding an error, so it is 1-erroneous. □

However  if  the  calculus  is  augmented  with  a  record  extension  construct 
a^l  ”  b  (like  in  [18,21])  then  the  empty  record  becomes  solvable:  for  any  value 
r;  there  is  a  relevant  context  ([-]M  =  v).l  yielding  that  value,  so  in  that  case 
{}  is  not  equal  to  e. 

6  Types 

This section illustrates the usefulness of both subsumption and labelled reductions for the semantics of types: subsumption is a natural foundation for interpreting subtyping, and labelled terms are a natural foundation for interpreting recursive types, following the approach of [7]. This is just an appetizer, as lack of space prevents us from going through full technical developments. Nevertheless the general approach borrows well-known techniques and therefore should be easy to follow.

Types are interpreted as non-empty, downward-closed subsets of terms in the $\sqsubseteq_\varepsilon$ ordering. Let Tset denote the set of such subsets. For any $t \in$ Tset, $t^n$ denotes the set $\{a^n \mid a \in t\}$ (finite projection). A type environment η is a mapping from Tvar to Tset. Given a type environment, a type interpretation function $\mathcal{T}[\![-]\!]_\eta$ maps types to members of Tset. We will illustrate this approach on the calculus of the previous section, considering types of the following syntax.

$T, U ::= \top \mid X \mid T \to U \mid \{l_i : T_i\} \mid \mu X.T$

Type  assignment  rules  and  subtyping  rules  are  not  displayed  here:  standard  rules 
are assumed (see for example [8]). We also assume a rule (top) assigning type $\top$ to any term. Figure 1 gives the type interpretation. A well-known difficulty
associated  with  recursive  types  is  the  fact  that  arrow  types  are  contravariant  on 
the  left.  The  ideal  model  of  [15]  solves  the  problem  through  contractive  maps  on 
ideals  in  the  semantic  domain;  this  requires  some  conditions  on  the  syntax  of  type 
expressions  to  enforce  contractiveness.  By  contrast  we  follow  here  the  idea  of  [7], 
using  a  family  of  indexed  type  interpretations,  where  the  index  denotes  finite 
approximations.  In  this  approach  non-contractive  type  expressions  are  naturally 
mapped  to  the  bottom  type  (the  one  containing  only  divergent  terms),  without 
any  syntactic  constraints.  With  labelled  terms  this  can  be  done  in  an  operational 
way,  without  needing  to  resort  to  denotational  semantics. 
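$\mathcal{T}[\![T]\!]^0_\eta = \{a \mid a \sqsubseteq_\Downarrow \Omega\}$

$\mathcal{T}[\![\top]\!]^{n+1}_\eta = T^{n+1}$

$\mathcal{T}[\![X]\!]^{n+1}_\eta = \eta(X)^{n+1}$

$\mathcal{T}[\![T \to U]\!]^{n+1}_\eta = \{a \in T^{n+1} \mid \forall b \in \mathcal{T}[\![T]\!]^n_\eta,\ a(b) \in \mathcal{T}[\![U]\!]^n_\eta\}$

$\mathcal{T}[\![\{\overline{l_i : T_i}\}]\!]^{n+1}_\eta = \{a \in T^{n+1} \mid \forall i,\ a.l_i \in \mathcal{T}[\![T_i]\!]^n_\eta\}$

$\mathcal{T}[\![T]\!]_\eta = \{a \mid \forall n \in \omega,\ a^n \in \mathcal{T}[\![T]\!]^n_\eta\}$

Fig. 1. Type interpretation for functions and records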


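Lemma 30. $\forall T, \eta,\ \mathcal{T}[\![T]\!]_\eta \in$ Tset.

Lemma 31. $T \le U \Rightarrow \mathcal{T}[\![T]\!]_\eta \subseteq \mathcal{T}[\![U]\!]_\eta$.

Definition 32. A closing substitution σ satisfies a basis Γ, written $\sigma \models \Gamma$, iff $\forall \eta, \forall x \in dom(\Gamma),\ \sigma(x) \in \mathcal{T}[\![\Gamma(x)]\!]_\eta$.

Theorem 33. $\Gamma \vdash a : T \Rightarrow (\forall \sigma \models \Gamma,\ a\sigma \in \mathcal{T}[\![T]\!])$.

Definition 34 (Trivial types). The set Triv of trivial types is defined inductively as:

Triv $= \{\top\} \cup \{T \to U \mid U \in$ Triv$\} \cup \{\{\overline{l_i : T_i}\} \mid T_i \in$ Triv$\} \cup \{\mu(X)T \mid T \in$ Triv$\}$

Lemma 35. In any non-trivial type environment, non-trivial types do not contain erroneous terms. (η is non-trivial iff $\varepsilon \notin \eta(X)$ for each type variable X in dom(η).)

Theorem 36. If $\Gamma \vdash a : T$ and $T \notin$ Triv, then $\forall \sigma \models \Gamma,\ \neg(a\sigma{\uparrow})$.

Proof. Consequence of the preceding lemma and of subject reduction, shown using standard techniques. □

Lemma 37. The following equality between record types is sound:

$\{l : \top, \overline{l_i : T_i}\} = \{\overline{l_i : T_i}\}$

Proof. Since $\varepsilon \in \mathcal{T}[\![\top]\!]$, the condition $a.l \in \mathcal{T}[\![\top]\!]$ on field $l$ is always satisfied, even for records where field $l$ is absent. □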

Example 38. The example of the introduction

$(\lambda x.\{l_1 = x.l_2,\ l_2 = x.l_1\})\{l_1 = 3\}$

has type $\{l_1 : \top,\ l_2 : Int\}$, which is equal to $\{l_2 : Int\}$ and is non-trivial.
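
The behaviour of Example 38 can be mimicked in a few lines of Python. This is only an illustrative sketch and not the paper's calculus: records are modelled as dictionaries, the error constant ε by a hypothetical sentinel `ERR`, and `select` is an assumed helper implementing field selection, which yields ε when the field is absent or the argument is not a record.

```python
ERR = "ε"  # hypothetical sentinel standing in for the error constant

def select(value, label):
    """Field selection a.l: the field if present, otherwise the error ε."""
    if not isinstance(value, dict):
        return ERR                      # selecting from a non-record yields ε
    return value.get(label, ERR)        # selecting an absent field yields ε

def swap_fields(x):
    """Models the function λx.{l1 = x.l2, l2 = x.l1}."""
    return {"l1": select(x, "l2"), "l2": select(x, "l1")}

result = swap_fields({"l1": 3})
print(result)
```

The error sits inside field `l1` of the result without reaching the top level, while field `l2` holds an integer. This matches the reading of the type $\{l_1 : \top,\ l_2 : Int\}$ as equal to $\{l_2 : Int\}$.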
A Language Rules

A.1 Standard λ-calculus with ε


Syntax 

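Red. Rules

$(\lambda x.a)\,b \to a[x := b]$

$a \to b \;\Rightarrow\; \lambda x.a \to \lambda x.b$

$(\varepsilon\,a) \to \varepsilon$

$a \to b \;\Rightarrow\; (a\,c) \to (b\,c)$ and $(c\,a) \to (c\,b)$

$\lambda x.a\,x \to a$, provided $x \in X$, $a \in T$, $x \notin FV(a)$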

Values 

(wa)  €  H  t)  6  V 

‘■^'-KVJv  W.6V 

A.2 λ-calculus with records


Syntax 

,  ,  yi,ai  eT  ^  ^  aeT 

(^>{/<=a,}GT  '")a./GT 

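Red. Rules

$\{l_i = a_i\}.l_j \to a_j$

$\{l_i = a_i\}.l \to \varepsilon$ (when $l$ is not among the $l_i$)

$(\lambda x.a).l \to \varepsilon$

$(\{l_i = a_i\}\,b) \to \varepsilon$

$a_i \to a'_i \;\Rightarrow\; \{\ldots, l_i = a_i, \ldots\} \to \{\ldots, l_i = a'_i, \ldots\}$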

Values 

/  ,  Vj,  aj  EiT  /  ^ 

=  ai}  en  ^‘^“^aev  ^'^K.ie'H 
Abstract. We study the dynamical behavior of D-dimensional linear cellular automata over $Z_m$. We provide easy-to-check necessary and sufficient conditions for a D-dimensional linear cellular automaton over $Z_m$ to be sensitive to initial conditions, expansive, strongly transitive, and equicontinuous.


1  Introduction 

Cellular Automata (CA) are dynamical systems consisting of a regular lattice of variables which can take a finite number of discrete values. The global state of the CA, specified by the values of all the variables at a given time, evolves in synchronous discrete time steps according to a given local rule which acts on the value of each single variable. CA have been widely studied in a number of disciplines (e.g., computer science, physics, mathematics, biology, chemistry) with different purposes (e.g., simulation of natural phenomena, pseudo-random number generation, image processing, analysis of universal models of computation, cryptography). For an introduction to the CA theory and an extensive and up-to-date bibliography see [7].

CA can display a rich and complex temporal evolution whose exact determination is in general very hard, if not impossible. In particular, some properties of the temporal evolution of general CA are undecidable [3, 4, 10]. Despite their simplicity, which makes a detailed algebraic analysis possible, linear CA over $Z_m$ (CA based on a linear local rule) exhibit many of the complex features of general CA. Several important properties of linear CA have been studied during the last few years [1, 5, 8, 9, 12, 13] and in some cases exact characterizations have been obtained. As an example, in [9] the authors present criteria for surjectivity and injectivity of linear CA, while in [2] the authors present criteria for topological transitivity and ergodicity.

Property         Characterization                                                                   Reference
Surjectivity     $\gcd(m, \lambda_1, \ldots, \lambda_s) = 1$                                        [9]
Injectivity      $(\forall p \in P)\ (\exists!\, \lambda_i):\ p \nmid \lambda_i$                    [9]
Transitivity     $\gcd(m, \lambda_2, \ldots, \lambda_s) = 1$                                        [2]
Sensitivity      $(\exists p \in P):\ p \nmid \gcd(\lambda_2, \ldots, \lambda_s)$                   This paper
Expansivity      $\gcd(m, a_1, \ldots, a_r) = \gcd(m, a_{-1}, \ldots, a_{-r}) = 1$                  This paper
Equicontinuity   $(\forall p \in P)\ p \mid \gcd(\lambda_2, \ldots, \lambda_s)$                     This paper
Strong Trans.    $(\forall p \in P)\ (\exists \lambda_i, \lambda_j):\ p \nmid \lambda_i \wedge p \nmid \lambda_j$   This paper

Fig. 1. Characterization of set theoretic and topological properties of linear CA over $Z_m$ in terms of the coefficients $\lambda_i$'s (for D-dimensional CA) or $a_i$'s (for 1-dimensional CA). $P$ denotes the set of prime factors of $m$.

In this paper we investigate the topological behavior of linear D-dimensional CA over $Z_m$. We focus our attention on a number of topological properties which are widely recognized as fundamental in the determination of the qualitative behavior of any discrete time dynamical system, namely sensitivity to initial conditions, expansivity, equicontinuity, and strong transitivity. The main contribution
of  this  paper  consists  in  efficiently  computable  criteria  for  deciding  whether  a 
linear  CA  satisfies  one  of  the  above  four  properties.  Our  criteria  are  reported  in 
Fig.  1  and  are  given  in  terms  of  the  coefficients  of  the  linear  local  map  associated 
to  the  CA.  Note  that,  using  our  criteria,  one  can  easily  construct  a  linear  CA 
which  satisfies  any  combination  of  the  above  properties.  The  criteria  we  propose 
require  only  gcd  computations  and  can  be  checked  in  polynomial  time  in  the 
number  of  coefficients  and  in  the  logarithm  of  the  cardinality  of  the  alphabet. 
The  dimension  of  the  lattice  does  not  explicitly  affect  the  computational  cost  of 
our criteria. The results of this paper hold for every dimension $D \ge 1$ and for every $m \ge 2$. Our results show that linear CA over $Z_m$ have dynamical aspects that linear CA over finite fields, such as $Z_p$ with $p$ prime, cannot have.


2  Basic  definitions 

Let $Z_m$, $m \ge 2$, denote the ring of integers modulo $m$. We consider the space of configurations

$\mathcal{C}^D_m = \{c \mid c: Z^D \to Z_m\}$

which consists of all functions from $Z^D$ into $Z_m$. Each element of $\mathcal{C}^D_m$ can be visualized as an infinite D-dimensional lattice in which each cell contains an element of $Z_m$. A special configuration is the null configuration $\mathbf{0}$ which has the property that $\mathbf{0}(v) = 0$ for all $v \in Z^D$.

Let $s \ge 1$. A neighborhood frame of size $s$ is an ordered set of distinct vectors $u_1, u_2, \ldots, u_s \in Z^D$. Given any function $f: Z_m^s \to Z_m$, a D-dimensional CA based on the local rule $f$ is the pair $(\mathcal{C}^D_m, F)$, where $F: \mathcal{C}^D_m \to \mathcal{C}^D_m$ is the global transition map defined as follows. For every $c \in \mathcal{C}^D_m$ the configuration $F(c)$ is such that for every $v \in Z^D$

$[F(c)](v) = f(c(v + u_1), \ldots, c(v + u_s))$.   (1)

In other words, the content of cell $v$ in the configuration $F(c)$ is a function of the content of the cells $v + u_1, \ldots, v + u_s$ in the configuration $c$. Note that the local rule $f$ and the neighborhood frame completely determine $F$.

A map $f: Z_m^s \to Z_m$ is linear if and only if there exist $\lambda_1, \ldots, \lambda_s \in Z_m$ such that $f(x_1, \ldots, x_s) = \sum_{i=1}^{s} \lambda_i x_i \pmod m$. From now on, we say that a CA defined over $Z_m$ is linear if the local rule on which it is based is linear over $Z_m$. Note that for a linear D-dimensional CA, equation (1) becomes

$[F(c)](v) = \sum_{i=1}^{s} \lambda_i\, c(v + u_i) \bmod m$.

We define the radius of the linear CA $(\mathcal{C}^D_m, F)$ as

$\rho(F) = \max\{\|u_i\|_\infty,\ 1 \le i \le s\}$,   (2)

where the maximum is restricted to the indices $i$ such that $\lambda_i \not\equiv 0 \pmod m$. As usual, $\|v\|_\infty$ denotes the maximum of the absolute value of the components of $v$. For linear 1-dimensional CA we use a simplified notation. A local rule of radius $r$ is written as $f(x_{-r}, \ldots, x_r) = \sum_{i=-r}^{r} a_i x_i \bmod m$, where at least one between $a_{-r}$ and $a_r$ is nonzero. Using this notation, the global map $F$ of a 1-dimensional CA with $\rho(F) = r$ becomes

$[F(c)](i) = \sum_{j=-r}^{r} a_j\, c(i + j) \bmod m$,   $i \in Z$.
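
The 1-dimensional global map can be sketched in Python for configurations with finitely many nonzero cells. This is only an illustration under stated assumptions: the function name `step_1d` and the dictionary encoding (cell index to value, with absent cells read as 0) are choices made here and do not come from the paper.

```python
def step_1d(config, coeffs, m):
    """One step of a linear 1-D CA on a finitely supported configuration.

    config: dict cell index -> value in Z_m (absent cells are 0)
    coeffs: dict offset j -> a_j, for -r <= j <= r
    """
    if not config:
        return {}
    r = max(abs(j) for j in coeffs)
    new = {}
    # Only cells within distance r of the support can become nonzero.
    for i in range(min(config) - r, max(config) + r + 1):
        value = sum(a * config.get(i + j, 0) for j, a in coeffs.items()) % m
        if value:
            new[i] = value
    return new

rule90 = {-1: 1, 1: 1}          # a_{-1} = a_1 = 1 over Z_2
c0 = {0: 1}                     # a single nonzero cell at the origin
c1 = step_1d(c0, rule90, 2)
c2 = step_1d(c1, rule90, 2)
print(c1, c2)
```

Starting from a single 1 at the origin, the two nonzero cells move one position outward at each of these two steps, since the middle contributions cancel modulo 2.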


In order to study the topological properties of D-dimensional CA, we introduce a distance over the space of the configurations. Let $\Delta: Z_m \times Z_m \to \{0, 1\}$ be defined by $\Delta(i,j) = 0$ if $i = j$ and $\Delta(i,j) = 1$ otherwise. Given $a, b \in \mathcal{C}^D_m$ the Tychonoff distance $d(a,b)$ is given by

$d(a,b) = \sum_{v \in Z^D} \frac{\Delta(a(v), b(v))}{2^{\|v\|_\infty}}$   (3)

It is easy to verify that $d$ is a metric on $\mathcal{C}^D_m$ and that the topology induced by $d$ coincides with the product topology induced by the discrete topology of $Z_m$.


2.1  Topological  Properties 

In this section we recall the definitions of some topological properties which determine the qualitative behavior of any general discrete time dynamical system. Here, we assume that the space of configurations $X$ is equipped with a distance $d$ and that the map $F$ is continuous on $X$ according to the topology induced by $d$ (for CA, Tychonoff distance satisfies this property). We denote by $B(x, \epsilon)$ the (open) set $\{y \in X : d(x,y) < \epsilon\}$.
Definition 1 (Sensitivity). A dynamical system $(X,F)$ is sensitive to initial conditions if and only if there exists $\delta > 0$ such that for any $x \in X$ and for any $\epsilon > 0$, there exists $y \in B(x,\epsilon)$ and $n \ge 0$, such that $d(F^n(x), F^n(y)) > \delta$. The value $\delta$ is called the sensitivity constant. □

Intuitively, a map is sensitive to initial conditions, or simply sensitive, if there exist points arbitrarily close to $x$ which eventually separate from $x$ by at least $\delta$ under iteration of $F$. Note that not all points near $x$ need eventually separate from $x$ under iteration, but there must be at least one such point in every neighborhood of $x$.

A property stronger than sensitivity is expansivity. Expansivity differs from sensitivity in that all nearby points must eventually separate by at least $\delta$. It is easy to verify that expansive CA are sensitive to initial conditions.

Definition 2 (Expansivity). A dynamical system $(X,F)$ is expansive if and only if there exists $\delta > 0$ such that for every $x, y \in X$ with $x \ne y$ there exists $n \ge 0$ such that $d(F^n(x), F^n(y)) > \delta$. The value $\delta$ is called the expansivity constant. □

If a dynamical system is sensitive to initial conditions or, even worse, expansive, then its dynamics defies numerical approximation. As an example, round-off errors may become magnified upon iterations of $F$ and the results of the numerical computation of an orbit, no matter how accurate, may be completely different from the real orbit.

Definition 3 (Equicontinuity at x). A dynamical system $(X,F)$ is equicontinuous at $x \in X$ if and only if for any $\delta > 0$ there exists $\epsilon > 0$ such that for any $y \in B(x, \epsilon)$ and $n \ge 0$ we have $d(F^n(x), F^n(y)) < \delta$. □

Definition 4 (Equicontinuity). A dynamical system $(X, F)$ is equicontinuous if and only if it is equicontinuous at every $x \in X$. □

The notions of sensitivity and equicontinuity are related. In fact, by comparing the definitions one can easily see that

$F$ is not sensitive $\Longleftarrow$ $\exists x$: $F$ is equicontinuous at $x$.   (4)


Definition 5 (Strong transitivity). A dynamical system $(X,F)$ is strongly transitive iff for every nonempty open set $U \subseteq X$ we have $\bigcup_{n=0}^{+\infty} F^n(U) = X$. □

A strongly transitive map $F$ has points which, under iteration of $F$, move from one arbitrarily small neighborhood to all the space of configurations $X$. A weaker notion is transitivity: a map $F$ is transitive iff for every nonempty open set $U$ the set $\bigcup_{n=0}^{+\infty} F^n(U)$ is a dense subset of $X$. Clearly, strongly transitive maps are transitive, and in view of [2, Theorem 6] ergodic with respect to the normalized Haar measure.
3 Statement of the new results

In this section we state the main results of this paper. The same results are summarized in Fig. 1.

Theorem 6. Let $F$ denote the global transition map of a linear D-dimensional CA over $Z_m$ defined by

$[F(c)](v) = \sum_{i=1}^{s} \lambda_i\, c(v + u_i) \bmod m$.   (5)

Assume $u_1 = 0$, that is, $\lambda_1$ is the coefficient associated to the null displacement. The global transition map $F$ is sensitive if and only if there exists a prime $p$ such that

$p \mid m$ and $p \nmid \gcd(\lambda_2, \lambda_3, \ldots, \lambda_s)$.   (6)

In other words, $F$ is sensitive unless every prime which divides $m$ divides also all the coefficients $\lambda_i$ with $i > 1$. □

Note that we can check the above condition without knowing the factorization of $m$. In fact, (6) holds if and only if $\gcd(\lambda_2, \lambda_3, \ldots, \lambda_s)$ does not contain all the prime factors of $m$. Since each prime appears in $m$ with a power at most $\lfloor \log_2 m \rfloor$, $F$ is sensitive if and only if $[\gcd(\lambda_2, \lambda_3, \ldots, \lambda_s)]^{\lfloor \log_2 m \rfloor} \not\equiv 0 \pmod m$.
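
The two formulations of the sensitivity criterion can be compared in a short Python sketch. The function names and the list encoding of the coefficients (index 0 holding $\lambda_1$, the coefficient of the null displacement) are assumptions made for this illustration only.

```python
from functools import reduce
from math import gcd

def prime_factors(m):
    """Return the set of prime factors of m by trial division."""
    factors, p = set(), 2
    while p * p <= m:
        while m % p == 0:
            factors.add(p)
            m //= p
        p += 1
    if m > 1:
        factors.add(m)
    return factors

def is_sensitive_by_primes(m, lambdas):
    """Condition (6): some prime p | m does not divide gcd(lambda_2..lambda_s)."""
    g = reduce(gcd, lambdas[1:], 0)
    return any(g % p != 0 for p in prime_factors(m))

def is_sensitive(m, lambdas):
    """Factorization-free test: g^floor(log2 m) is nonzero modulo m."""
    g = reduce(gcd, lambdas[1:], 0)
    exponent = m.bit_length() - 1      # floor(log2 m) for m >= 2
    return pow(g, exponent, m) != 0

print(is_sensitive(6, [1, 2, 4]), is_sensitive(4, [1, 2, 0]))
```

The second function needs only gcd computations and one modular exponentiation, which is why the factorization of $m$ is never required.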

Theorem 7. Let $F$ denote the global transition map of a linear 1-dimensional CA over $Z_m$ with local rule $f(x_{-r}, \ldots, x_r) = \sum_{i=-r}^{r} a_i x_i \bmod m$. The global transition map $F$ is expansive if and only if

$\gcd(m, a_{-r}, \ldots, a_{-1}) = 1$ and $\gcd(m, a_1, \ldots, a_r) = 1$.   (7)

□

Note that by Theorem 5.3 in [6] we know that expansive CA, whether linear or not, do not exist in any dimension $D \ge 2$.
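
Condition (7) translates directly into two gcd computations. The sketch below assumes, for illustration, that the coefficients are given as a list $[a_{-r}, \ldots, a_0, \ldots, a_r]$ of odd length; the function name is likewise hypothetical.

```python
from functools import reduce
from math import gcd

def is_expansive_1d(m, coeffs):
    """Check condition (7) for coeffs = [a_{-r}, ..., a_0, ..., a_r]."""
    if len(coeffs) % 2 == 0:
        raise ValueError("expected 2r + 1 coefficients")
    r = len(coeffs) // 2
    left = coeffs[:r]          # a_{-r}, ..., a_{-1}
    right = coeffs[r + 1:]     # a_1, ..., a_r
    # gcd(m, ...) over each side; an empty side leaves m itself, which is >= 2.
    return reduce(gcd, left, m) == 1 and reduce(gcd, right, m) == 1

print(is_expansive_1d(2, [1, 0, 1]))   # a_{-1} = a_1 = 1 over Z_2
print(is_expansive_1d(2, [0, 0, 1]))   # only a_1 is nonzero
```

The second rule fails the test because its left-hand coefficients are all zero, so $\gcd(m, a_{-1}) = m$.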

Theorem 8. Let $F$ denote the global transition map of the linear D-dimensional CA over $Z_m$ defined by (5). The following statements are equivalent: (i) $F$ is equicontinuous in at least one point, (ii) $F$ is equicontinuous at every point, and (iii) for each prime $p$ such that $p \mid m$ we have $p \mid \gcd(\lambda_2, \lambda_3, \ldots, \lambda_s)$. □

By Theorem 8 and (4), a linear CA is either sensitive or equicontinuous. Hence, $F$ is equicontinuous if and only if $[\gcd(\lambda_2, \lambda_3, \ldots, \lambda_s)]^{\lfloor \log_2 m \rfloor} \equiv 0 \pmod m$.

Theorem 9. Let $F$ denote the global transition map of a linear D-dimensional CA over $Z_m$ defined by (5). The global transition map $F$ is strongly transitive if and only if for each prime $p$ such that $p \mid m$, there exist at least two coefficients $\lambda_i, \lambda_j$ such that $p \nmid \lambda_i$ and $p \nmid \lambda_j$. □

We can check whether $F$ is strongly transitive without knowing the factorization of $m$. In fact, the above condition is equivalent to $\gcd(m, \lambda_1, \lambda_2, \ldots, \lambda_{s-1}) = \gcd(m, \lambda_1, \lambda_2, \ldots, \lambda_{s-2}, \lambda_s) = \cdots = \gcd(m, \lambda_2, \lambda_3, \ldots, \lambda_s) = 1$.
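
The leave-one-out gcd test can be sketched as follows. The function name and the list encoding of $\lambda_1, \ldots, \lambda_s$ are assumptions for this illustration.

```python
from functools import reduce
from math import gcd

def is_strongly_transitive(m, lambdas):
    """For every i, gcd(m, all coefficients except lambda_i) must equal 1."""
    return all(
        reduce(gcd, lambdas[:i] + lambdas[i + 1:], m) == 1
        for i in range(len(lambdas))
    )

print(is_strongly_transitive(6, [1, 1]))   # two coefficients coprime to 6
print(is_strongly_transitive(6, [2, 3]))   # only one coefficient is odd
```

Dropping one coefficient at a time is what enforces "at least two": if a prime $p \mid m$ fails to divide only a single $\lambda_i$, removing that $\lambda_i$ leaves a gcd divisible by $p$.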
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4  Proof  of  the  main  theorems 

We  now  prove  the  results  stated  in  Sect.  3.  Due  to  limited  space  the  proof  of  The¬ 
orem  8  is  reported  in  [11].  In  our  proofs  we  make  use  of  the  formal  power  series 
(fps)  representation  of  the  configuration  space  (see  [9,  Sec.  3]  for  details).  For 
D  =  1,  to  each  configuration  c  G  we  associate  the  fps  Pc{X)  -  EigZ 
The  advantage  of  this  representation  is  that  the  computation  of  a  linear  map 
is  equivalent  to  power  series  multiplication.  Let  be  a  linear  map 

with  local  rule  /(a;_r ,  •  •  • , associate  to  F  the  finite  fps 
=  EL-r  -  Then,  for  any  c  G  we  have 

Pf{c)(X)  =  Pc{X)Aj{X)  (mod  m).  (8) 
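
Identity (8) can be tried out on a finitely supported configuration. The sketch below is an illustration only: Laurent polynomials are encoded as dictionaries from exponents to coefficients, `laurent_mul` is a hypothetical helper name, and $A_f(X)$ is built with exponent $-j$ for coefficient $a_j$, which is the convention that makes the product agree with $[F(c)](i) = \sum_j a_j c(i+j)$.

```python
def laurent_mul(p, q, m):
    """Multiply two finite Laurent polynomials (dict: exponent -> coeff) modulo m."""
    out = {}
    for e1, c1 in p.items():
        for e2, c2 in q.items():
            out[e1 + e2] = (out.get(e1 + e2, 0) + c1 * c2) % m
    return {e: c for e, c in out.items() if c}

m = 6
coeffs = {-1: 2, 0: 1, 1: 3}                 # a_{-1} = 2, a_0 = 1, a_1 = 3
config = {0: 1, 2: 5, 3: 4}                  # P_c(X) = 1 + 5 X^2 + 4 X^3
A_f = {-j: a for j, a in coeffs.items()}     # A_f(X) = sum_j a_j X^{-j}
image = laurent_mul(config, A_f, m)          # coefficients of P_{F(c)}(X)
print(image)
```

For instance, the coefficient of $X^1$ in the product is $2 \cdot c(0) + 1 \cdot c(1) + 3 \cdot c(2) = 17 \equiv 5 \pmod 6$, which is exactly $[F(c)](1)$.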

Note that each coefficient of $P_{F(c)}(X)$ is well defined since $A_f(X)$ has only finitely many nonzero coefficients. Note also that the finite fps associated to $F^n$ is $A_f^n(X)$. More generally, to each configuration $c \in \mathcal{C}_m^D$ we associate the formal power series

$P_c(X_1, \ldots, X_D) = \sum_{i_1, \ldots, i_D \in \mathbf{Z}} c(i_1, \ldots, i_D) X_1^{i_1} \cdots X_D^{i_D}$.

The computation of a linear map F over $\mathcal{C}_m^D$ is equivalent to the multiplication by a finite fps $A(X_1, \ldots, X_D)$ which can be easily obtained from the local rule f and the neighborhood frame $u_1, \ldots, u_s$. The finite fps associated to the map F defined by (5) is $A(X_1, \ldots, X_D) = \sum_{i=1}^{s} \lambda_i X_1^{-u_i(1)} \cdots X_D^{-u_i(D)}$, where $u_i(j)$ denotes the j-th component of vector $u_i$.

Throughout the paper, given a fps $H(X)$ and $i \in \mathbf{Z}$, we use $(H(X))_i$ to denote the coefficient of $X^i$ in $H(X)$.
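Identity (8) can be checked numerically on configurations with finite support. The sketch below is ours, not the authors': it assumes the 1-dimensional convention $[F(c)](i) = \sum_j a_j\, c(i+j)$, stores a configuration as a dictionary from positions to nonzero values, and stores the local rule as a dictionary from offsets j to coefficients $a_j$. The names `step_direct` and `step_fps` are hypothetical.

```python
def step_direct(c, a, m):
    """One CA step from the local rule: [F(c)](i) = sum_j a[j] * c(i + j) mod m."""
    sites = {i - j for i in c for j in a}  # only these cells can become nonzero
    out = {}
    for i in sites:
        value = sum(coef * c.get(i + j, 0) for j, coef in a.items()) % m
        if value:
            out[i] = value
    return out


def step_fps(c, a, m):
    """The same step as the product P_c(X) * A_f(X), with A_f(X) = sum_j a[j] X^(-j)."""
    out = {}
    for i, ci in c.items():
        for j, aj in a.items():
            out[i - j] = (out.get(i - j, 0) + ci * aj) % m
    return {i: v for i, v in out.items() if v}
```

Both functions return the same dictionary for any finite configuration, which is exactly the content of (8); iterating `step_fps` n times corresponds to multiplying by $A_f^n(X)$.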

4.1  Sensitivity 

In  this  section  we  characterize  sensitive  linear  CA.  We  prove  our  results  only  in 
the  2-dimensional  case,  since  the  proofs  for  the  other  dimensions  are  similar. 

Let $F: \mathcal{C}_m^2 \to \mathcal{C}_m^2$ denote the global transition map of a 2-dimensional CA. For any integer k > 0, let $V_k$ denote the set of configurations $c \in \mathcal{C}_m^2$ such that $c(v) = 0$ for $\|v\|_\infty < k$. It is straightforward to verify that F is sensitive if and only if there exists $\delta > 0$ such that for any configuration $c \in \mathcal{C}_m^2$ we have

$\forall k\ \exists c' \in V_k:\ d(F^n(c + c'), F^n(c)) > \delta$ for some n > 0.  (9)

In fact, (9) implies that we can find a configuration, arbitrarily close to c, whose distance from c exceeds $\delta$ after a sufficiently large number of iterations.

If F is linear we can get rid of the initial configuration c. In fact, we have

$d(F^n(c + c'), F^n(c)) = d(F^n(c) + F^n(c'), F^n(c)) = d(F^n(c'), 0)$.

Hence, F is sensitive if and only if

$\forall k\ \exists c' \in V_k:\ d(F^n(c'), 0) > \delta$ for some n > 0.  (10)

This observation leads to the following lemma.

Lemma 10. Let F denote the global transition map of a linear D-dimensional CA over $\mathbf{Z}_m$. F is sensitive if and only if

$\limsup_{n \to \infty} \rho(F^n) = \infty$;  (11)

(the radius $\rho$ of a CA is defined by (2)).

Proof. We prove the result for D = 2. If (11) does not hold, there exists M such that $\rho(F^n) \le M$ for all n. Thus, if k > M, for all $c \in V_k$ we have $F^n(c) \in V_{k-M}$. Elementary calculus shows that, for $c \in V_t$, the distance $d(c, 0)$ is bounded by a quantity that tends to 0 as t grows. Hence, for any $\delta$, if k is large enough $c \in V_k$ implies $d(F^n(c), 0) < \delta$ for all n, and F cannot be sensitive.

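Assume now (11) holds. Then, for every k we can find n such that $\rho(F^n) = z > k$. Let $\lambda_i^{(n)}, u_i^{(n)}$ denote the coefficients and the displacements of the local map associated to $F^n$. $\rho(F^n) = z$ implies that there exists j such that $\lambda_j^{(n)} \ne 0$ and $\|u_j^{(n)}\|_\infty = z$. Let c be the configuration that takes the value 1 at the single site whose value $F^n$ multiplies by $\lambda_j^{(n)}$ when computing $[F^n(c)](0)$ (a site of norm z), and the value 0 elsewhere.

Clearly, $c \in V_k$ and $[F^n(c)](0) = \lambda_j^{(n)} \ne 0$, which implies (10). □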

Proof of Theorem 6. Let F denote the global transition map of a linear 2-dimensional CA, and let

$A(X, Y) = \sum_{v \le i \le w,\ y \le j \le z} a_{ij} X^i Y^j$

denote the finite fps associated to F. Assume (6) holds. Then, there exist a prime p and a coefficient $a_{s,u}$ such that $p \mid m$, $p \nmid a_{s,u}$ and at least one between s and u is nonzero. We now prove that, as a consequence, $\limsup_{n \to \infty} \rho(F^n) = \infty$. Without loss of generality, we can assume $s \ne 0$, and that for i < s we have $p \mid a_{ij}$. Let $\tilde{A}(X, Y) = A(X, Y) \bmod p$. By our assumptions, $\tilde{A}(X, Y)$ can be written as $X^s G(Y)$ plus terms in $X^i$ with $s < i \le w$, where $G(Y) \not\equiv 0$. Hence,

$(A^n(X, Y) \bmod p) = \tilde{A}^n(X, Y) = X^{ns} G^n(Y) + (\text{terms in } X^i \text{ with } ns < i \le nw)$.

Since $\mathbf{Z}_p$ is an integral domain, we have $G^n(Y) \not\equiv 0$ which implies $\rho(F^n) \ge n|s|$.

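Assume now that $p \mid m \Rightarrow p \mid \lambda_i$ for all $i \ne 1$. Let $m = p_1^{\beta_1} \cdots p_t^{\beta_t}$ denote the factorization of m, and let $k = \max_i \beta_i$. We prove that $\rho(F^n) \le \rho(F)(k - 1)$. Let $b_{ij}$ denote the coefficients of the fps associated to $F^n$. We have

$b_{ij} = \sum_{i_1 + \cdots + i_n = i,\ j_1 + \cdots + j_n = j} a_{i_1, j_1} \cdots a_{i_n, j_n}$.  (12)

If $\max(|i|, |j|) > \rho(F)(k - 1)$, each term $a_{i_1, j_1} \cdots a_{i_n, j_n}$ contains at least k coefficients with $\max(|i_h|, |j_h|) \ne 0$. Hence, $p \mid m \Rightarrow p^k \mid (a_{i_1, j_1} \cdots a_{i_n, j_n})$ and each term in the sum (12) is a multiple of m. Hence, $\rho(F^n) \le \rho(F)(k - 1)$ and by Lemma 10 F is not sensitive. □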
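The dichotomy used in this proof can be observed on small 1-dimensional examples. The sketch below is an illustration under our own assumptions: a finite fps is a dictionary from exponents to coefficients, and the radius of a map is taken to be the largest absolute exponent carrying a nonzero coefficient, which is how we read definition (2), a definition not reproduced in this section. The helper names are hypothetical.

```python
def poly_mul(a, b, m):
    """Product of two finite fps (dicts exponent -> coefficient), reduced mod m."""
    out = {}
    for i, ai in a.items():
        for j, bj in b.items():
            out[i + j] = (out.get(i + j, 0) + ai * bj) % m
    return {e: v for e, v in out.items() if v}


def radius(a):
    """Largest |exponent| with a nonzero coefficient (0 for the zero series)."""
    return max((abs(e) for e in a), default=0)


def radii_of_powers(a, m, n_max):
    """Radii of A, A^2, ..., A^n_max computed modulo m."""
    power, radii = {0: 1 % m}, []
    for _ in range(n_max):
        power = poly_mul(power, a, m)
        radii.append(radius(power))
    return radii
```

Over $\mathbf{Z}_4$ the series 1 + 2X has every non-central coefficient divisible by 2, and its powers keep radius at most $\rho(F)(k-1) = 1$; the series 1 + X has radius n after n steps, so the corresponding map is sensitive by Lemma 10.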

4.2 Expansivity

In this section we characterize expansive linear CA. Since expansive CA do not exist in dimension $D \ge 2$ (see [6, Theorem 5.3]) we can restrict ourselves to the 1-dimensional case.

Let $F: \mathcal{C}_m^1 \to \mathcal{C}_m^1$ denote the global transition map of a 1-dimensional CA. It is straightforward to verify that F is expansive if and only if there exists $\delta > 0$ such that for any configuration $c \in \mathcal{C}_m^1$ we have

$\forall c' \in \mathcal{C}_m^1\ \exists n > 0:\ d(F^n(c + c'), F^n(c)) > \delta$.

Reasoning as in Sect. 4.1, if F is linear we can get rid of the particular configuration c. We have

$d(F^n(c + c'), F^n(c)) = d(F^n(c) + F^n(c'), F^n(c)) = d(F^n(c'), 0)$.

Hence, F is expansive if and only if for any $c' \in \mathcal{C}_m^1$ we have $d(F^n(c'), 0) > \delta$ for a sufficiently large n. Clearly, this is equivalent to assuming that there exists M > 0 such that

$\forall c' \in \mathcal{C}_m^1\ \exists n > 0:\ [F^n(c')](i) \ne 0$ for some i with $|i| < M$.

For any integer k > 0, let $W_k$ denote the set of configurations $c \in \mathcal{C}_m^1$ such that $c(i) = 0$ for $|i| < k$ and at least one between $c(k)$ and $c(-k)$ is different from zero. Since $\delta$ can be chosen arbitrarily, we have that F is expansive iff $\exists M$ such that for all $k \ge M$

$\forall c' \in W_k\ \exists n > 0:\ [F^n(c')](i) \ne 0$ for some i with $|i| < M$.  (13)

If we visualize each configuration as a biinfinite array, (13) tells us that the essential feature of expansive maps is that any pattern of nonzero values can “propagate” from positions arbitrarily far away from 0 up to a position i with $|i| < M$. Informally, we say that any nonzero pattern can propagate for an arbitrarily large distance. For a comparison, sensitive 1-dimensional linear CA can be seen as those CA in which for each t > 0 there exists a nonzero pattern which propagates by at least t positions.

Proof of Theorem 7 (sketch). First we prove that (7) is a necessary condition for expansivity. Assume for example $\gcd(m, a_1, \ldots, a_r) = q_1 > 1$, and let $q_2 = m/q_1$. For any integer k > 0 let $c_k \in W_k$ denote the configuration defined by $c_k(i) = q_2$ if i = k and $c_k(i) = 0$ otherwise. We show that for every n > 0 and i < k we have $[F^n(c_k)](i) = 0$ which implies that F is not expansive. Let $A(X) = \sum_{i=-r}^{r} a_i X^{-i}$ be the finite fps associated to f. Since the fps associated to $c_k$ is $q_2 X^k$, we have

$P_{F^n(c_k)}(X) = q_2 X^k A^n(X) \pmod m$.

By hypothesis, for j < 0, $(A(X))_j$ is a multiple of $q_1$. Since the same is true for $A^n(X)$, for i < k we have $[F^n(c_k)](i) \equiv 0 \pmod m$ as claimed.

Now we prove that condition (7) implies expansivity. Let $c \in \mathcal{C}_m^1$ be such that $c(v) \ne 0$ and $c(i) = 0$ for i > v. We show that $\gcd(m, a_{-1}, \ldots, a_{-r}) = 1$ implies that for any integer w there exists n such that $[F^n(c)](i) \ne 0$ for some i > w. This proves that any one-sided nonzero pattern can propagate arbitrarily far away to the right. Similarly, $\gcd(m, a_1, \ldots, a_r) = 1$ implies that any one-sided nonzero pattern can propagate arbitrarily far away to the left. Combining these two facts we get (13) (the details will be given in the full paper).

Let $c \in \mathcal{C}_m^1$ be such that $c(v) \ne 0$ and $c(i) = 0$ for i > v, and let $C(X) = \sum_{i \le v} c_i X^i$ denote the associated fps. Since $m \nmid c_v$, there exist a prime p and an integer k such that $p^k \mid m$ and $p^k \nmid c_v$. Let $A(X) = \sum_{i=-r}^{r} a_i X^{-i}$ denote the finite fps associated to f. Since $\gcd(m, a_{-1}, \ldots, a_{-r}) = 1$, we can find t, $0 < t \le r$, such that

$p \nmid a_{-t}$ and $p \mid a_{-i}$ for $t < i \le r$.  (14)

Under these assumptions we show that if n is a multiple of $p^k (k-1)!$ then $[F^n(c)](v + nt) = (C(X) A^n(X))_{v+nt} \not\equiv 0 \pmod m$.

Clearly, this proves our claim that every one-sided nonzero pattern propagates arbitrarily far away to the right. Let $\tilde{A}(X) = A(X) \bmod p^k$. By (14) we know that $\tilde{A}(X)$ satisfies the hypothesis of Lemma A.4 of [11]. Hence, if n is a multiple of $p^k (k-1)!$, we have $\tilde{A}^n(X) = \sum_{i \le nt} \tilde{a}_i X^i$ with $\gcd(\tilde{a}_{nt}, p^k) = 1$. We have

$[F^n(c)](v + nt) = (\tilde{A}^n(X) C(X))_{v+nt} \pmod{p^k}$
$= \sum_{i + j = v + nt} \tilde{a}_i c_j \pmod{p^k}$
$= \tilde{a}_{nt} c_v \pmod{p^k}$.

Since $p^k \nmid c_v$ and $p \nmid \tilde{a}_{nt}$, $[F^n(c)](v + nt)$ is not a multiple of $p^k$. We conclude that $[F^n(c)](v + nt) \not\equiv 0 \pmod m$ as claimed. □
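The two gcd conditions used in this proof are easy to evaluate. The sketch below assumes, as the proof suggests but this section does not state explicitly, that condition (7) is the conjunction $\gcd(m, a_{-r}, \ldots, a_{-1}) = 1$ and $\gcd(m, a_1, \ldots, a_r) = 1$. The function name `is_expansive` and the dictionary encoding of the local rule (offset j mapped to $a_j$) are our own choices.

```python
from functools import reduce
from math import gcd


def is_expansive(a, m):
    """Check gcd(m, a_j for j < 0) == 1 and gcd(m, a_j for j > 0) == 1."""
    left = reduce(gcd, (v for j, v in a.items() if j < 0), m)
    right = reduce(gcd, (v for j, v in a.items() if j > 0), m)
    return left == 1 and right == 1
```

For example, over $\mathbf{Z}_6$ the rule with $a_{-1} = a_1 = 1$ satisfies both conditions, while the rule with $a_{-1} = 1$, $a_0 = 1$, $a_1 = 2$ does not: here $q_1 = 2$, $q_2 = 3$, and the configuration $c_k$ of the proof never reaches the cells to the left of position k.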


4.3  Strong  transitivity 

In this section we give a characterization of strongly transitive linear CA. The proof is quite complex and we will need some preliminary lemmas. To simplify the notation we consider only the 1-dimensional case; the proof for dimensions D > 1 is analogous and will be given in the full paper. Let $V_k = \{x \in \mathcal{C}_m^1 \mid x(i) = 0 \text{ for } |i| < k\}$. For any $x \in \mathcal{C}_m^1$ let

$V(x, k) = x + V_k = \{y \mid y = x + z,\ z \in V_k\}$.

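For any nonempty open subset $U \subseteq \mathcal{C}_m^1$ we can find $x \in U$ and $\varepsilon > 0$ such that $B(x, \varepsilon) \subseteq U$. Elementary calculus shows that $V(x, k) \subseteq B(x, \varepsilon) \subseteq U$ for every k above a threshold of order $\log(1/\varepsilon)$, hence F is strongly transitive if and only if

$\forall x\ \forall k\quad \bigcup_{n=0}^{+\infty} F^n(V(x, k)) = \mathcal{C}_m^1$.  (15)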

We  are  now  ready  to  establish  a  simple  condition  which,  for  linear  maps,  implies 
strong  transitivity. 

Lemma 11. Let F be a linear 1-dimensional map over $\mathbf{Z}_m$. If, for all k, there exists $n_k$ such that $F^{n_k}(V_k) = \mathcal{C}_m^1$, then F is strongly transitive.

Proof. For all $x \in \mathcal{C}_m^1$ and k > 0 we have

$\bigcup_{n=0}^{+\infty} F^n(V(x, k)) \supseteq F^{n_k}(x + V_k) = F^{n_k}(x) + F^{n_k}(V_k) = \mathcal{C}_m^1$.

□

We prove the “if” part of Theorem 9 using Lemma 11 and the power series representation of CA. Lemma 12 establishes the result for the special case in which the cardinality of $\mathbf{Z}_m$ is a prime power, while Lemma 13 proves the result in the general case.

Lemma 12. Let $A(X) = \sum_{-r \le i \le r} a_i X^i$ denote a finite fps over $\mathbf{Z}_{p^k}$ (p prime). Suppose there exist two coefficients $a_i, a_j$ such that $\gcd(p, a_i) = \gcd(p, a_j) = 1$, and let n be any multiple of $p^k (k-1)!$. Then, for each fps $C(X)$ we can find $B(X) = \sum_{i \in \mathbf{Z}} b_i X^i$ such that $B(X) A^n(X) = C(X) \pmod{p^k}$ and

$b_{-\lfloor n/2 \rfloor} = b_{-\lfloor n/2 \rfloor + 1} = \cdots = b_{\lfloor n/2 \rfloor - 2} = b_{\lfloor n/2 \rfloor - 1} = 0$.

Due to limited space we do not report the proof of Lemma 12 here (see [11]).

Lemma 13. Let $A(X) = \sum_{-r \le i \le r} a_i X^i$ denote a finite fps over $\mathbf{Z}_m$. Suppose that for each prime p which divides m there exist two coefficients $a_i, a_j$ such that $\gcd(p, a_i) = \gcd(p, a_j) = 1$. Then, for any integer z > 0 there exists n such that for each fps $C(X) = \sum_{i \in \mathbf{Z}} c_i X^i$ we can find $B(X) = \sum_{i \in \mathbf{Z}} b_i X^i$ such that

$b_{-z} = b_{-z+1} = \cdots = b_{z-2} = b_{z-1} = 0$, and $B(X) A^n(X) = C(X) \pmod m$.  (16)

Proof. Let $m = p_1^{k_1} \cdots p_h^{k_h}$, $q_i = p_i^{k_i}$ and $k = \max_i k_i$. Let n denote a multiple of $m (k-1)!$ such that $n \ge 2z$. Clearly n is a multiple of $q_i (k_i - 1)!$ for i = 1, …, h. By Lemma 12 we know that given $C(X)$ we can find $B_i(X) = \sum_{j \in \mathbf{Z}} b_j^{(i)} X^j$ such that

$b_{-z}^{(i)} = \cdots = b_{z-1}^{(i)} = 0$, and $B_i(X) A^n(X) = C(X) \pmod{q_i}$.

Since $\gcd(q_i, m/q_i) = 1$, we can find $\beta_i$ such that $\beta_i (m/q_i) = 1 \pmod{q_i}$. Let

$B(X) = \sum_{i=1}^{h} \beta_i (m/q_i) B_i(X)$.

For i = 1, …, h we have $B(X) = B_i(X) \pmod{q_i}$. Hence, $B(X) A^n(X) = C(X) \pmod{q_i}$ for all i, which implies (16). □
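The recombination step of this proof is the Chinese remainder construction applied coefficient by coefficient. A minimal sketch for a single coefficient follows; the function name `crt_combine` is ours, and the modular inverse relies on the three-argument `pow` with a negative exponent, which requires Python 3.8 or later. The moduli play the role of the pairwise coprime prime powers $q_i$.

```python
def crt_combine(residues, moduli):
    """Return B mod m with B = b_i (mod q_i) for all i, via sum of beta_i*(m/q_i)*b_i."""
    m = 1
    for q in moduli:
        m *= q
    total = 0
    for b, q in zip(residues, moduli):
        cofactor = m // q             # m / q_i, coprime to q_i
        beta = pow(cofactor, -1, q)   # beta_i * (m / q_i) = 1 (mod q_i)
        total += beta * cofactor * b
    return total % m
```

Applying this function to the j-th coefficients of $B_1(X), \ldots, B_h(X)$ yields the j-th coefficient of $B(X)$; in particular a coefficient that is 0 in every $B_i$ stays 0 in $B$, which is why the vanishing block in (16) is preserved.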

Proof of Theorem 9. The “if” part follows directly from Lemmas 11 and 13. To prove the “only if” part we use again the power series representation. Let $A(X) = \sum_{-r \le i \le r} a_i X^i$ denote the finite fps associated to the map F, and assume there exist a prime p and an integer j such that $p \mid m$ and $p \mid a_i$ for all $i \ne j$. Let $a_i^{(n)}$, $-rn \le i \le rn$, denote the coefficients of $A^n(X)$. It is straightforward to verify that, for $i \ne jn$, we have $p \mid a_i^{(n)}$. Consider now any configuration $b \in V_1$. The corresponding fps $B(X) = \sum_{i \in \mathbf{Z}} b_i X^i$ is such that $b_0 = 0$. We have

$[F^n(b)](nj) = \sum_{i=-rn}^{rn} a_i^{(n)} b_{nj-i}$.

Since $b_0 = 0$, all terms in the summation are multiples of p and $p \mid [F^n(b)](nj)$. Hence, the configuration c such that c(i) = 1 for all $i \in \mathbf{Z}$ clearly does not belong to $F^n(V_1)$, and by (15) F cannot be strongly transitive. □

Recognizability Equals Definability for Partial k-Paths*


Valentine  Kabanets 

School  of  Computing  Science,  Simon  Fraser  University,  Vancouver,  Canada 


Abstract. We prove that every recognizable family of partial k-paths is definable in a counting monadic second-order logic. We also show that the obstruction set of the class of partial k-paths is computable for every k.


1  Introduction 

In 1960, Büchi [1] showed that a language is regular iff it is definable by some formula in a monadic second-order logic, MS. Here, MS is the extension of the first-order logic that allows quantification over set variables. A set of objects is definable by an MS-formula if the formula is true exactly on the members of the set. Thus Büchi established that recognizability is equivalent to MS-definability for words. Doner [7] then extended this result to ranked trees.

Graphs  are  algebraic  objects  since  any  graph  can  be  constructed  from  smaller 
graphs  using  certain  graph  operations.  They  are  also  logical  structures  since  any 
graph  is  completely  determined  by  the  set  of  its  vertices  and  the  adjacency 
relation  on  this  set.  Thus  the  notions  of  recognizability  and  definability  can  be 
extended  to  finite  graphs.  Courcelle  [2]  proved  that  every  MS-definable  set  of 
finite  graphs  is  recognizable,  but  not  conversely.  However,  he  was  able  to  extend 
the  result  of  Doner  to  unordered  unbounded  trees  using  a  counting  monadic 
second-order  logic,  CMS,  an  extension  of  MS  that  allows  modular  counting. 

The question remained whether there was a sufficiently large class of graphs for which recognizability would imply CMS-definability. In their study of graph minors, Robertson and Seymour [10] introduced the notion of the tree-width of a graph. A graph of tree-width k exhibits certain tree-like structure. Such a graph can be decomposed into subgraphs of size k + 1 arranged as nodes of a tree (tree-decomposition) so that the nodes containing a given vertex form a subtree.

The class of graphs of tree-width at most k coincides with that of partial k-trees. Among other classes of graphs of bounded tree-width are trees and forests (tree-width ≤ 1), series-parallel graphs and outerplanar graphs (≤ 2), and Halin graphs (≤ 3).

The class of graphs of bounded tree-width plays an important role for another reason. Courcelle showed in [2] that the MS-theory of the class of partial k-trees is decidable. Seese [11] proved that if the MS-theory of a class of finite graphs $\mathcal{G}$ is decidable, then the graphs in $\mathcal{G}$ have uniformly bounded tree-width. Thus, tree-width “characterizes” classes of finite graphs having decidable MS-theories.

* This research was done while the author was at Simon Fraser University [8]. The author’s present address is Department of Computer Science, University of Toronto, Toronto, ON, Canada M5S 3G4; kabanets@cs.utoronto.ca.

Strictly speaking, the above results hold for so-called $MS_2$ logic, where $MS_2$ denotes the monadic second-order language using quantification over both vertex sets and edge sets of graphs; $MS_1$ is the language that uses quantification over vertex sets only (see [5, 6]). In this paper, we are using $MS_2$ and $CMS_2$.

For graphs of tree-width at most k, recognizability is defined using a tree automaton working on the corresponding tree-decompositions: A set $\mathcal{G}$ of partial k-trees G is recognizable if there is a tree automaton that accepts any tree-decomposition of each graph $G \in \mathcal{G}$ and rejects tree-decompositions of graphs not in $\mathcal{G}$. Courcelle [3] showed that a recognizable set of partial k-trees is CMS-definable for k = 1 and k = 2, and conjectured that recognizability implies CMS-definability of partial k-trees for every k. Kaller [9] proved the case of k = 3 and the case of k-connected partial k-trees.

We establish that every recognizable set of partial k-paths is CMS-definable, thereby proving a special case of Courcelle’s conjecture. A partial k-path, or graph of bounded path-width, is a partial k-tree for which the corresponding tree-decomposition is a path-decomposition. Partial k-paths are recognized by finite automata working on the corresponding path-decompositions.

Our second result deals with computing the obstruction sets of minor-closed graph families. The class of partial k-trees (k-paths) is minor-closed and its obstruction set can be determined from the MS-formula defining that class [4]. We describe how to construct the MS-formula defining the class of partial k-paths for every given k. As a consequence, the obstruction sets of the classes of partial k-paths are computable for each k.

The remainder of this article is organized as follows: In Sect. 2, we give the necessary definitions. In Sect. 3, we show that recognizability implies CMS-definability for a generalization of the class of k-connected partial k-paths, the class of (k, 1)-paths. This is a base case of our solution for arbitrary partial k-paths, which is outlined in Sect. 4.

2  Preliminaries 

2.1 Partial k-Paths

We consider finite and simple graphs G = (V, E), where V is the vertex-set and E is the edge-set of G. A path-decomposition (or decomposition) of G is a sequence $B = (B_1, \ldots, B_m)$ of vertex-subsets, called bags, such that

1. every vertex $v \in V$ belongs to some bag $B_i$ ($1 \le i \le m$),

2. for each edge $e \in E$, there is a $B_i$ ($1 \le i \le m$) containing both ends of e,

3. for any $i, l, j \in \{1, \ldots, m\}$ such that $i \le l \le j$, $B_i \cap B_j \subseteq B_l$.

The path-width of a decomposition $B = (B_1, \ldots, B_m)$ is $\max_{1 \le i \le m} \{|B_i|\} - 1$. A decomposition of path-width at most k will be called a k-decomposition. The path-width of a graph G is the minimum path-width over all decompositions of G. A partial k-path is a graph of path-width at most k.

Example 1. Graphs $G_1$ (Fig. 1) and $G_2$ (Fig. 2) are partial 1-path and 2-path, respectively, with possible decompositions: $B(G_1) = (\{1,2\}, \{2,3\}, \{3,4\}, \{3,5\}, \{3,6\})$ and $B(G_2) = (\{1,1',2\}, \{1,2,3\}, \{2,3,4\}, \{2,3,5\}, \{2,3,6\})$.
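The three conditions of a path-decomposition translate directly into a checker. The sketch below is ours; since the figures are not reproduced here, the edge set used for $G_1$ (the tree with edges 1-2, 2-3, 3-4, 3-5, 3-6) is an assumption inferred from the bags of Example 1, and the function names are hypothetical.

```python
def is_path_decomposition(vertices, edges, bags):
    """Check conditions 1-3 of the definition for a sequence of bags (sets)."""
    # 1. every vertex belongs to some bag
    if not all(any(v in bag for bag in bags) for v in vertices):
        return False
    # 2. every edge has both ends in a common bag
    if not all(any(u in bag and v in bag for bag in bags) for u, v in edges):
        return False
    # 3. B_i ∩ B_j ⊆ B_l whenever i <= l <= j
    for i in range(len(bags)):
        for j in range(i, len(bags)):
            common = bags[i] & bags[j]
            if any(not common <= bags[l] for l in range(i, j + 1)):
                return False
    return True


def path_width(bags):
    """Path-width of a decomposition: largest bag size minus one."""
    return max(len(bag) for bag in bags) - 1


# Assumed edge set of G1 (the figure is not available in this text).
G1_VERTICES = {1, 2, 3, 4, 5, 6}
G1_EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (3, 6)]
G1_BAGS = [{1, 2}, {2, 3}, {3, 4}, {3, 5}, {3, 6}]
```

With these inputs the checker accepts $B(G_1)$ and reports path-width 1; swapping the second and third bags breaks condition 3, because vertex 2 would then occur in two bags separated by a bag that does not contain it.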


Fig. 1. A partial 1-path $G_1$. Fig. 2. A partial 2-path $G_2$.


For a partial k-path G = (V, E) with a decomposition $B = (B_1, \ldots, B_m)$, first(v) is the number of the bag where a vertex $v \in V$ appears for the first time, i.e., $first(v) = \min_{1 \le l \le m} \{l \mid v \in B_l\}$; $new(B_i)$ ($1 \le i \le m$) is the set of vertices in $B_i$ that appear in the decomposition for the first time, i.e., $new(B_i) = \{u \in B_i \mid first(u) = i\}$; and $old(B_i)$ is the set of vertices in $B_i$ that also appear in some earlier bag, i.e., $old(B_i) = B_i \setminus new(B_i)$.

For G and B as above, a vertex $u \in B_r$ ($1 \le r \le m$) is called a drop vertex of $B_r$ iff for every $w \in V \setminus \bigcup_{i=1}^{r} B_i$, $\{u, w\} \notin E$. The set of all drop vertices of $B_r$ ($1 \le r \le m$) is denoted by $drop(B_r)$. The remaining vertices of $B_r$ are called non-drop vertices of $B_r$, the set of which is denoted by non-drop($B_r$).


2.2 CMS-Definability

A graph G = (V, E) can be viewed as a relational structure $(V \cup E, \{p_v, p_e, Inc\})$, where $p_v$ and $p_e$ are unary predicates that define the vertex-set and the edge-set, respectively, and Inc is the ternary incidence predicate, i.e., for any $e \in E$ and $u, v \in V$, Inc(e, u, v) = True iff e = {u, v}.

The language of counting monadic second-order logic corresponding to graphs G has the usual logical connectives: $\neg$ (“not”), $\wedge$ (“and”), $\vee$ (“or”), $\Rightarrow$ (“if-then”), and $\Leftrightarrow$ (“if and only if”), universal ($\forall$) and existential ($\exists$) quantifiers, equality symbol =, a sequence u, v, w, …, of individual variables, a sequence U, V, W, …, of set variables, the membership symbol $\in$, the unary predicate symbols $mod_{p,q}$, where p < q are non-negative integers, and the predicate symbols $p_v$, $p_e$, and Inc. In our interpretation, $mod_{p,q}(V)$ = True iff |S| = p mod q, where S is the set denoted by the set variable V.

A graph property P is called CMS-definable over a class of graphs $\mathcal{G}$ iff there is a CMS-formula $\varphi$ such that for each $G \in \mathcal{G}$, G satisfies P iff $\varphi$ is true on G.

Example 2. Connectedness of a graph G is an MS-definable property:
Connected = $\forall V_1\ \forall V_2\ (V_1 \ne \emptyset \wedge V_2 \ne \emptyset \wedge V_1 \cup V_2 = V) \Rightarrow Adj(V_1, V_2)$,
$Adj(V_1, V_2) = \exists v_1\ \exists v_2\ v_1 \in V_1 \wedge v_2 \in V_2 \wedge adj(v_1, v_2)$,
$adj(v_1, v_2) = \exists e\ Inc(e, v_1, v_2)$,
where $(V_i \ne \emptyset) = \exists v\ p_v(v) \wedge v \in V_i$ (i = 1, 2) and
$(V_1 \cup V_2 = V) = \forall v\ p_v(v) \Leftrightarrow (v \in V_1 \vee v \in V_2)$.
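On small graphs the formula of Example 2 can be evaluated by brute force, quantifying over all nonempty vertex subsets. The sketch below is only meant to make the semantics concrete; the function names are ours, and the running time is exponential in the number of vertices.

```python
from itertools import combinations


def nonempty_subsets(vertices):
    """All nonempty subsets of a vertex set, as Python sets."""
    vs = list(vertices)
    for size in range(1, len(vs) + 1):
        for combo in combinations(vs, size):
            yield set(combo)


def adj_sets(v1, v2, edges):
    """Adj(V1, V2): some edge joins a vertex of V1 to a vertex of V2."""
    return any((a in v1 and b in v2) or (a in v2 and b in v1) for a, b in edges)


def connected_formula(vertices, edges):
    """Evaluate: for all nonempty V1, V2 with V1 ∪ V2 = V, Adj(V1, V2) holds."""
    vertices = set(vertices)
    for v1 in nonempty_subsets(vertices):
        for v2 in nonempty_subsets(vertices):
            if v1 | v2 == vertices and not adj_sets(v1, v2, edges):
                return False
    return True
```

On the path 1-2-3 the formula evaluates to true, while on the graph with vertices {1, 2, 3} and the single edge 1-2 the choice $V_1 = \{1, 2\}$, $V_2 = \{3\}$ falsifies it.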

Using $mod_{0,2}$, we can express in CMS the property that a given vertex subset of a graph has even cardinality. This cannot be done in MS alone [2].


2.3  Recognizability 

We define the notion of recognizability of partial k-paths in terms of deterministic finite automata $\mathcal{A} = (\Sigma, Q, \delta, q_0, F)$ working on extended decompositions. A decomposition $B = (B_1, B_1', \ldots, B_m, B_m')$ is called extended iff dropping old vertices and adding new vertices occur separately, i.e., $B_i'$ = non-drop($B_i$), $1 \le i \le m$.

Example 3. Here is an extended 1-decomposition of the graph $G_1$: $B(G_1) = (\{1,2\}, \{2\}, \{2,3\}, \{3\}, \{3,4\}, \{3\}, \{3,5\}, \{3\}, \{3,6\}, \{\})$.
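The extended decomposition of Example 3 can be recomputed from the definitions of drop and non-drop vertices. The sketch below again assumes that $G_1$ is the tree with edges 1-2, 2-3, 3-4, 3-5, 3-6, an edge set that is our inference since the figure is not available; the function names are hypothetical.

```python
def neighbours(v, edges):
    """Neighbours of v in a simple undirected graph given as a list of pairs."""
    return {b if a == v else a for a, b in edges if v in (a, b)}


def drop(r, bags, edges):
    """Drop vertices of bags[r]: no neighbour outside bags[0] ∪ ... ∪ bags[r]."""
    seen = set().union(*bags[: r + 1])
    return {u for u in bags[r] if neighbours(u, edges) <= seen}


def extend(bags, edges):
    """Interleave each bag B_i with B_i' = non-drop(B_i)."""
    extended = []
    for r, bag in enumerate(bags):
        extended.append(set(bag))
        extended.append(set(bag) - drop(r, bags, edges))
    return extended


# Assumed edge set of G1 and the decomposition from Example 1.
G1_EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (3, 6)]
G1_BAGS = [{1, 2}, {2, 3}, {3, 4}, {3, 5}, {3, 6}]
```

Under that assumption, `extend(G1_BAGS, G1_EDGES)` returns exactly the ten bags listed in Example 3, ending with the empty bag because both vertices of the last bag are drop vertices.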

Let G = (V, E) be a partial k-path with an extended k-decomposition $B = (B_1, \ldots, B_m)$. Let $\beta : V \to \{1, \ldots, k + 1\}$ be a labeling function such that any two distinct vertices in the same bag or in two consecutive bags have different labels. We call such labeling functions admissible by B. It is not difficult to see that k + 1 labels always suffice in the case of extended decompositions. For the labeling function $\beta$ and any set of vertices $W \subseteq V$, $\beta(W) = \bigcup_{w \in W} \{\beta(w)\}$.

For B and $\beta$ described above, we define the following string $\sigma_\beta(B)$ of colored undirected graphs on at most k + 1 vertices: $\sigma_\beta(B) = (\sigma_\beta(B_1), \ldots, \sigma_\beta(B_m))$, where for a bag $B_i$ ($1 \le i \le m$), $\sigma_\beta(B_i) = (V_\beta(B_i), E_\beta(B_i))$ such that $V_\beta(B_i) = \beta(B_i)$, and for every $u, u' \in B_i$, $\{\beta(u), \beta(u')\} \in E_\beta(B_i)$ iff $\{u, u'\} \in E$. Let $\Sigma_k$ be the set of all colored (with colors 1, …, k + 1) undirected graphs on at most k + 1 vertices. Clearly, $|\Sigma_k|$ is bounded by a function of k.

A family $\mathcal{G}$ of partial k-paths G is called recognizable iff there is an automaton $\mathcal{A}$ with the input alphabet $\Sigma_k$ such that for any G, $G \in \mathcal{G}$ iff $\sigma_\beta(B) \in L(\mathcal{A})$ for any extended k-decomposition B of G and any labeling function $\beta$ admissible by B, and $G \notin \mathcal{G}$ iff $\sigma_\beta(B) \notin L(\mathcal{A})$ for any B and $\beta$ as above. Here $L(\mathcal{A})$ denotes the language accepted by $\mathcal{A}$.

3 The Case of (k, 1)-Paths

3.1 (k, 1)-Paths and k-Generative Orders

A connected partial k-path is called a (k, 1)-path if it allows a k-decomposition $B = (B_1, \ldots, B_m)$ satisfying the following conditions:

1. $old(B_i)$ = non-drop($B_{i-1}$) for every $i \in \{2, \ldots, m\}$,

2. $drop(B_i) \ne \emptyset$ for every $i \in \{1, \ldots, m\}$,

3. $|new(B_i)| = 1$ for every $i \in \{2, \ldots, m\}$.

Here (1) says that vertices are dropped from a bag as soon as possible, (2) that each bag contains at least one drop vertex, and (3) that exactly one new vertex is added to form the next bag. Note that every k-connected partial k-path is a (k, 1)-path.

Example 4. The graphs $G_1$ and $G_2$ described earlier are (k, 1)-paths.

To show that a recognizable family $\mathcal{G}$ of (k, 1)-paths G is CMS-definable, it suffices to define in CMS some extended decomposition for every G and then use Büchi’s result for sets of words. A decomposition of G can be defined if some linear order on V is known. Let < be an arbitrary linear order on V, and let $(v_1, \ldots, v_n)$ be the sequence of vertices in V ordered according to <. We define the sequence $B_< = (B_1, \ldots, B_n)$, where $B_i = \{v_i\} \cup \{v_j \mid j < i$ and there is $j' \ge i$ s.t. $\{v_j, v_{j'}\} \in E\}$. Clearly, $B_<$ is a decomposition of G. For a partial k-path G, a linear order < on V is called k-generative if $B_<$ is a k-decomposition. Conversely, from a (k, 1)-decomposition B of G, one can define a k-generative linear order on G by setting u to be less than v iff first(u) < first(v), $u, v \in V$, and ordering the vertices in $B_1$ arbitrarily.
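The construction of $B_<$ from a linear order is short enough to write out. In the sketch below (our own naming; `order` lists the vertices from smallest to largest) a vertex $v_j$ is kept in bag $B_i$ exactly while some edge from $v_j$ still reaches a vertex at position i or later. The example data reuse the edge set we assume for $G_1$, which is inferred from Example 1 because the figure is not available.

```python
def decomposition_from_order(order, edges):
    """Build B_< = (B_1, ..., B_n) from a linear order on the vertices."""
    pos = {v: i for i, v in enumerate(order)}
    bags = []
    for i, v in enumerate(order):
        bag = {v}
        for a, b in edges:
            lo, hi = sorted((a, b), key=pos.__getitem__)
            # lo precedes position i and has a neighbour at position >= i
            if pos[lo] < i <= pos[hi]:
                bag.add(lo)
        bags.append(bag)
    return bags


def generated_width(order, edges):
    """Path-width of B_<; the order is k-generative iff this is at most k."""
    return max(len(bag) for bag in decomposition_from_order(order, edges)) - 1


# Assumed edge set of G1.
G1_EDGES = [(1, 2), (2, 3), (3, 4), (3, 5), (3, 6)]
```

With the natural order 1, …, 6 the bags are {1}, {1,2}, {2,3}, {3,4}, {3,5}, {3,6}, so that order is 1-generative; the order 4, 5, 6, 3, 2, 1 forces the bag {3, 4, 5, 6} and therefore yields width 3.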

Thus, to show that recognizability implies CMS-definability for (k, 1)-paths, it would suffice to define in CMS a k-generative linear order for every given (k, 1)-path. However, there are (k, 1)-paths for which no linear order can be defined in CMS. Consider the family of $G_n = (\{0, 1, \ldots, n\}, E_n)$, where $E_n = \{\{0, j\} \mid 1 \le j \le n\}$. No linear orders can be CMS-defined on $G_n$, since these graphs have nontrivial automorphisms, and the size of $G_n$ can be arbitrarily large. So, in general, we cannot CMS-define a k-decomposition of a partial k-path.

For a partial k-path G, a partial order on V is called k-generative if every completion to a linear order on V is k-generative. We will describe a certain k-generative partial order, which is MS-definable over a suitably colored version of a (k, 1)-path G. Given such a partial order, one can MS-define a tree-decomposition of G of a special form. Since we cannot MS-define a path-decomposition but only a tree-decomposition, we need CMS to get the formula for recognizability of the colored graph, using an extension of Büchi’s theorem. To convert the corresponding CMS-formula into a formula for the underlying uncolored (k, 1)-paths G, we “guess” some coloring of G using a constant number of $\exists$ quantifiers, check in MS if it induces the required structure, and apply our CMS-formula to the colored graph.

To  MS-define  a  /^-generative  partial  order  on  a  (A;,  l)-path  G  with  a  (A?,  1)- 
decomposition  B  =  (Bi , . . . ,  Bm),  we  convert  G  into  the  directed  graph  G^  — 
(y,  E^)  using  the  following  algorithm.  For  a  bag  Br  =  o\.d(Br)  U  new(Br)  (1  < 
r  <  m),  where  o\d{Br)  —  {ui, .  ■  .,Us}  and  new(Br)  =  {u},  if  {v,Uj}  G  E,  then 
(u,  Uj)  G  E^ .  That  is,  we  direct  the  edges  from  new  to  old  vertices.  To  simplify 
the  notation,  we  will  often  omit  the  superscript  in  E^  and  the  subscript  in  cy 

Now  we  label  G^  as  follows.  For  v  G  new(Br)  and  every  u  G  o\d{Br)  0 
drop(5r)  (1  <  r  <  m),  we  color  the  arc  v  u  with  some  new  color.  This 
colored  arc  will  be  denoted  as  a  double  arrow  v  ^  and  the  set  of  them  as  E^ . 
If  {i;}  =  ne\w{Br)  —  drop(Br),  we  color  v  with  some  new  color,  the  same  color 
for  all  such  vertices;  v  will  be  denoted  by  having  a  loop  arrow. 

Example 5. For $G_2$ defined earlier, the (k, 1)-decomposition $B(G_2)$ induces the labeled digraph of $G_2$ shown in Fig. 3.


Fig.  3.  The  labeled  digraph  G2,  with  double  arrows  shown  as  thick  single  arrows. 


3.2 A k-Generative Partial Order

Given  the  digraph  G^  induced  by  a  {k,  l)-decomposition  B  of  a  {k,  l)-path  G,  we 
define  the  following  binary  relation  of  strong  precedence,  denoted  by  on  the  set 
V:  for  any  u,  v  E  V,  u  ^  v  iff  either  (v,  u)  E  E  or  there  is  some  w  E  V  such  that 
(u,  iv)  E  E  and  (u,  w)  E  B-..  The  reflexive  and  transitive  closure  of  denoted 
by  -N,  is  called  precedence.  Semantically,  u  v  means  that  flrst(u)  <  first(u). 
We  extend  ■<  so  that  for  any  two  vertices  u  E  Bi  and  v  ^  B\  incomparable 
with  respect  to  u  is  less  than  v.  Let  denote  the  transitive  closure  of  that 
extension.  Obviously,  ::<Ms  a  ^-generative  partial  order  on  G. 

To  define  the  required  CMS-formula  for  recognizability  of  (^,  l)-paths,  we 
need  a  certain  refinement  of  We  color  G^  so  that  the  precedence  relation  ^ 
is  completed  to  a  linear  order  on  the  set  non-drop(Bi).  We  do  so  by  coloring  the 
non-drop  vertices  of  Bi  with  colors  1, . . . ,  A;  so  that  no  two  vertices  are  colored 
the  same.  We  denote  this  new  colored  digraph  by  G^^ 

Using  G^^  enables  us  to  define  the  following  k  sets  Pi, ... ,  Pk-  For  any  v  E 
V,  V  E  Pi  (1  <  i  <  A:)  iff  2  is  the  minimum  over  the  labels  of  the  vertices 
u  E  non-drop (Bi)  such  that  there  is  a  path  of  double  arrows  in  the  digraph  G^^ 
from  V  to  u.  The  set  N  of  nodes  is  defined  as  N  —  Uf-^Pi,  the  set  L  of  leaves  is 
defined  as  L  =  U\  (W  U  Bi). 

Example  6.  The  digraph  G2  from  Example  5  can  be  viewed  as  Gf^  with  the  two 
sets  of  nodes  Pi  =  {1,  3,  6}  and  P2  =  {2},  and  the  set  of  leaves  L  =  {4,  5}. 

Since  no  vertex  in  G^  can  have  more  than  one  incoming  double  arrow,  each 
set  Pi,  I  <  i  <  k,  induces  a  path  of  double  arrows  in  G^F  Therefore,  each  Pi  is 
linearly  ordered  by  Using  this  fact,  we  can  MS-define  a  A^-generative  partial 
order  on  G  that  is  a  linear  order  on  the  set  of  nodes  N .  We  denote  this  partial 
order  by  Note  that  we  could  MS-define  a  tree-decomposition  of  G  using  X'U 
We  need  to  order  the  leaves  that  are  incomparable  with  respect  to  .  By 
the  definition  of  a  (A:,  l)-decomposition,  each  leaf  w  £  L  has  at  most  k  outgoing 
single  arrows  pointing  to  some  nodes  from  different  sets  Pi, . . .  ^  Pk-  For  a  leaf 
w  e  L,  P{w)  denotes  the  set  of  nodes  to  which  there  are  arrows  from  w,  i.e., 
P(iu)  =  {-u  6  6  -E*}-  We  associate  with  each  leaf  w  £  Lits  characteristic 

vector  xi'^)  =  •  •  •  >  where  for  each  1  <  i  <  Ar,  Xi(^)  =  1  if 

P(w)  n  Pi  ^  and  Xi(^)  =  0  otherwise.  We  extend  to  a  new  partial  order 
on  V,  denoted  by  ,  by  ordering  the  leaves  incomparable  with  respect  to 
lexicographically  according  to  their  characteristic  vectors. 

For two vertices $w_1, w_2 \in V$, we say that $w_1$ and $w_2$ are p-equivalent, denoted
by $w_1 \stackrel{p}{\sim} w_2$, iff $w_1, w_2 \in L$ and $P(w_1) = P(w_2)$. For the quotient
graph $G_p = G/\stackrel{p}{\sim} = (V_p, E_p)$ we extend the order to the set $V_p$ in the
standard way. Clearly, the extended order is a linear order on the set
$(N \cup L)/\stackrel{p}{\sim}$. Ordering the drop vertices of $B_1$ arbitrarily
yields a $k$-generative linear order on $G_p$, denoted by $<_p$. We will denote the
digraph  G^^  with  ordered  drop  vertices  of  Bi  by  G^^  . 

Example 7. For $G_2$, the $(k,1)$-decomposition of the corresponding quotient graph
is $B'_p = (\{[1],[1'],[2]\}, \{[1],[2],[3]\}, \{[2],[3],[4]\}, \{[2],[3],[6]\})$, where
$[u]$ denotes the set of vertices p-equivalent to $u$, $u \in V$.

3.3  A  CMS- Formula 

Let $B'_p = (B'_1,\dots,B'_m)$ be the $(k,1)$-decomposition of the graph $G_p$ induced by
$<_p$. We can construct a $(k,1)$-decomposition of the original graph $G$ as follows.
In the sequence $B'_p$ replace $B'_1$ with $B_1$. For every $i \in \{1,\dots,m\}$, replace
$B'_i = \{[u_1]_p,\dots,[u_{s_i}]_p,[w]_p\}$, where $[w]_p$ is the new vertex of $B'_i$
such that $[w]_p = \{w_1,\dots,w_{t_i}\}$ ($t_i \ge 1$), with the sequence of bags
$B(w_1) = \{u_1,\dots,u_{s_i},w_1\},\dots,B(w_{t_i}) = \{u_1,\dots,u_{s_i},w_{t_i}\}$.
Let $B'$ denote the decomposition of $G$ thus constructed.

Example 8. For $G_2$, two decompositions $B'$ are possible:
$(\{1,1',2\}, \{1,2,3\}, \{2,3,4\}, \{2,3,5\}, \{2,3,6\})$ or
$(\{1,1',2\}, \{1,2,3\}, \{2,3,5\}, \{2,3,4\}, \{2,3,6\})$.

Let  us  convert  B'^  into  the  extended  decomposition  B^p  and  color  Gp  with 
some  labeling  function  Pp  :  Vp  ->{l,...,Ar+l}  admissible  by  B'p.  Let  us  also 
convert  the  decomposition  B'  of  G  into  the  extended  decomposition  B'  and  color 
the  graph  G  with  the  labeling  function  p  :  V  {1,...,A;  +  1}  such  that,  for  every 
V  £V ,  P{v)  =  )•  The  labeling  function  P  is  admissible  by  B*  since  no  leaf 

appears  in  two  consecutive  bags.  Note  that  the  symbols  in  the  alphabet  Eg  that 
correspond  to  the  bags  B'{wi)  and  B'[w2),  for  any  two  -^-leaves  wi  and  W2,  are 
identical.  Let  ap^(B'p)  =  {(ri,(Tif, . . .  Then  o-p{B')  can  be  obtained 

from  by  repeating  every  subsequence  {(Ti,  aip  {2  <  i  <  m)  |[u;]^|  times, 

where  new (5^-)  =  t>e  shown  that  (Tp^{B^p)  is  MS-definable. 

Let $A = (\Sigma_g, Q, \delta, q_0, F)$ be the automaton recognizing a family $\mathcal{G}$
of $(k,1)$-paths $G$. To obtain the required CMS-formula for recognizability of
$\mathcal{G}$, we use an extension of Büchi's result to words that are defined as
sequences of substrings given with their multiplicities (in our case, the symbol
sequences of the decomposition of $G_p$ with the cardinalities of the corresponding
p-equivalence classes). By finiteness of $Q$, to determine the behavior of $A$ on a
substring $\omega$ repeated $t$ times, it suffices to know $t \bmod a$ for some constant
$a$ dependent on $A$. Therefore, every recognizable family of colored $(k,1)$-paths
is CMS-definable.
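
The finiteness argument can be illustrated with a small sketch. It assumes that the behavior of the automaton on a substring is given as a map on states, encoded here as a Python tuple (an encoding chosen for this illustration, not taken from the paper). Powers of such a map are eventually periodic, so the $t$-fold repetition is determined by a bounded threshold together with $t$ modulo the period; the helper names are hypothetical.

```python
def compose(f, g):
    """Return the map 'apply f, then g' on states 0..n-1 (maps are tuples)."""
    return tuple(g[f[q]] for q in range(len(f)))


def power(f, t):
    """Behavior of f repeated t times (t >= 0), by plain repeated composition."""
    result = tuple(range(len(f)))  # identity map: zero repetitions
    for _ in range(t):
        result = compose(result, f)
    return result


def index_and_period(f):
    """Smallest i >= 1 and a >= 1 such that f^(i+a) == f^i."""
    seen = {}          # maps each power f^t to the exponent t where it first appeared
    current, t = f, 1
    while current not in seen:
        seen[current] = t
        current = compose(current, f)
        t += 1
    i = seen[current]
    return i, t - i


def power_reduced(f, t):
    """Same result as power(f, t), using only a threshold test and t modulo the period."""
    i, a = index_and_period(f)
    if t < i:
        return power(f, t)
    return power(f, i + (t - i) % a)
```

For a permutation of the states the threshold is 1 and only the residue of $t$ matters; for other maps the sketch also needs the comparison of $t$ against the threshold.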

Let $\varphi$ be the CMS-formula checking the recognizability of suitably colored
$(k,1)$-paths. We state without proof that there is an MS-formula $\varphi_{adm}$
verifying that a given coloring $c$ of a $(k,1)$-path $G$ is such that $G$ is recognized
by $A$ iff $\varphi$ holds for $G$ colored by $c$. Then the required CMS-formula for
uncolored $(k,1)$-paths $G$ is the following: $\exists$ "coloring $c$ of $G$"
$\varphi_{adm}(c) \wedge \varphi(G \text{ colored by } c)$.

Theorem 1. Every recognizable family of $(k,1)$-paths is CMS-definable.

4  The  General  Case 

4.1  Nice  Decompositions 

In general, a partial $k$-path is not necessarily a $(k,1)$-path; consider the partial
2-path from Example 1 with a new edge connecting vertices 4 and 5. We
generalize our definition of $(k,1)$-decomposition as follows. A decomposition
$B = (B_1,\dots,B_m)$ of $G$ is called nice iff all of the following conditions hold:

1. old($B_i$) = non-drop($B_{i-1}$) for every $i \in \{2,\dots,m\}$,

2. drop($B_i$) $\ne \emptyset$ for every $i \in \{1,\dots,m\}$,

3. for any $i \in \{2,\dots,m\}$, if |new($B_i$)| > 1, then

(a) for any $v \in \bigcup_{j=i}^{m}$ new($B_j$), each decomposition
$(B_1,\dots,B_{i-1}, \text{old}(B_i) \cup \{v\}, C_1,\dots,C_s)$ of $G$ is such that
drop(old($B_i$) $\cup \{v\}$) $= \emptyset$, and

(b) for any subset $S \subset$ new($B_i$), each decomposition
$(B_1,\dots,B_{i-1}, \text{old}(B_i) \cup S, C_1,\dots,C_s)$ of $G$ is such that
drop(old($B_i$) $\cup S$) $= \emptyset$.

Here (1) and (2) are as those for $(k,1)$-decompositions, and (3) says that if
more than one new vertex is added to form $B_i$, then both (a) there was no single
non-added vertex to choose instead of the set new($B_i$) so that $B_i$ contained a
drop vertex and (b) new($B_i$) is a minimal set with respect to set inclusion such
that $B_i$ contains a drop vertex.

It is not difficult to show that every $k$-decomposition can be converted into
a nice $k$-decomposition. We call a nice $k$-decomposition $B = (B_1,\dots,B_m)$ a
$(k,p)$-decomposition for some $1 \le p \le k$ iff |new($B_i$)| $\le p$ for all
$1 < i \le m$. A partial $k$-path allowing a $(k,p)$-decomposition will be called a
$(k,p)$-path.

Let $B = (B_1,\dots,B_m)$ be a nice $k$-decomposition of a partial $k$-path $G$. The
family of sets new($B_i$) ($1 \le i \le m$) forms a partitioning of the vertex-set $V$
of $G$. We call the corresponding equivalence on $V$ the 1-equivalence, denoted by
$\stackrel{1}{\sim}$. The decomposition $B$ also induces a linear order on the quotient
set $V/\stackrel{1}{\sim}$, denoted by $<_1$. Clearly, given the pair
$(\stackrel{1}{\sim}, <_1)$, we can reconstruct the decomposition $B$ of $G$. Although we
can MS-define the 1-equivalence when $G$ is suitably colored, it is impossible to
MS-define $<_1$.
We will divide a $k$-decomposition of a partial $k$-path $G$ into a sequence of
monotonic pieces whose structure resembles that of $(k,1)$-decompositions.
Formally, a contiguous subsequence $(B_i,\dots,B_{i+l})$ ($1 \le i \le m$) of a
decomposition $B = (B_1,\dots,B_m)$ is called monotonic iff |new($B_i$)| > 1 and
|new($B_r$)| = 1 for each $i < r \le i+l$. The nice decomposition $B$ can then be
viewed as a sequence of monotonic pieces $(M_1,\dots,M_d)$, where
$M_s = (B_{i_s},\dots,B_{j_s})$ for each $1 \le s \le d$. Note that a nice decomposition
is defined so that it is monotonic as long as possible; then there is a "jump"
(more than one new vertex is added to a bag), which starts a new monotonic piece,
and so on.

We define the sets new($M_s$) $= \bigcup_{r=i_s}^{j_s}$ new($B_r$) ($1 \le s \le d$), the
family of which forms a partitioning of the vertex-set $V$ of $G$. The corresponding
equivalence on $V$ is called 2-equivalence and denoted by $\stackrel{2}{\sim}$. This
sequence of monotonic pieces also induces a linear order on the quotient set
$V/\stackrel{2}{\sim}$, denoted by $<_2$. Some $k$-decomposition of $G$ (possibly
different from $B$) can be constructed given $\stackrel{2}{\sim}$ and $<_2$. Again, we
can MS-define the 2-equivalence on a suitably colored graph, but not $<_2$.


4.2  k-Generative Structures

For a partial $k$-path $G$, a triple $(\stackrel{1}{\sim}, \stackrel{2}{\sim}, <_2)$, where
$\stackrel{1}{\sim}$ and $\stackrel{2}{\sim}$ are equivalences on $V$ and $<_2$ is a linear
order on $V/\stackrel{2}{\sim}$, is called a linear $k$-generative structure on $G$ iff
there exists some nice $k$-decomposition $B$ of $G$ such that $\stackrel{1}{\sim}$ and
$\stackrel{2}{\sim}$ are the 1-equivalence and 2-equivalence, respectively, induced by
$B$, and $<_2$ is the linear order on 2-equivalence classes induced by $B$. For a
partial $k$-path $G$, a triple $(\stackrel{1}{\sim}, \stackrel{2}{\sim}, \preceq_2)$, where
$\stackrel{1}{\sim}$ and $\stackrel{2}{\sim}$ are equivalences on $V$ and $\preceq_2$ is a
partial order on $V/\stackrel{2}{\sim}$, is called a partial $k$-generative structure
on $G$ iff any completion of $\preceq_2$ to a linear order yields a linear
$k$-generative structure on $G$.

Let $\stackrel{1}{\sim}$ and $\stackrel{2}{\sim}$ be the 1-equivalence and 2-equivalence,
respectively, induced by some nice $k$-decomposition of a partial $k$-path $G$. Let
$\prec$ be the precedence relation defined similarly to the case of $(k,1)$-paths, and
let $\stackrel{2}{\prec}$ be the extension of $\prec$ to the quotient set
$V/\stackrel{2}{\sim}$ in the standard way. The triple
$(\stackrel{1}{\sim}, \stackrel{2}{\sim}, \stackrel{2}{\prec})$ is not necessarily a partial
$k$-generative structure on $G$. One reason is that each $\stackrel{2}{\sim}$-class
$[u]_2$ ($u \in V$) contains several vertices all of which must be put in the same
bag. The other reason is that $[u]_2$ can "contribute" more non-drop vertices than
drop vertices. We did not have the latter problem in the case of $(k,1)$-paths,
because there adding a new vertex always produced at least one drop vertex.

To get around these problems, we put consecutive monotonic pieces of the
$k$-decomposition $B$ of $G$ into sequences of minimal length such that the number
of non-drop vertices produced by each sequence, except the first one, is
at most that of drop vertices. More formally, let $\mu = (M_s,\dots,M_t)$ be a
contiguous subsequence of a nice $k$-decomposition $B$ that corresponds to the
sequence of bags $(B_{i_s},\dots,B_{j_t})$. We define the balance of $\mu$, bal($\mu$), as
bal($\mu$) = |non-drop($B_{j_t}$)| $-$ |old($B_{i_s}$)|. A contiguous subsequence $\mu$ of
monotonic pieces is called balanced if bal($\mu$) $\le 0$ and no proper non-empty
prefix of $\mu$ is of non-positive balance.

Let $B = (M_1,\dots,M_d)$, where $M_s$, $1 \le s \le d$, is a monotonic piece. We
divide $B$ into disjoint subsequences of monotonic pieces such that
$B = \mu_1 \dots \mu_r$ and each $\mu_i$, $2 \le i \le r$, is balanced. It can be shown
that every $\mu_i$, $2 \le i \le r$, corresponds to a $(k,k-1)$-subdecomposition of $G$.
The sets new($\mu_i$), $1 \le i \le r$, defined in an obvious way induce a partitioning
of $V$. The corresponding equivalence is called $3_1$-equivalence. Recursively, we
partition each $\mu_i$, $1 \le i \le r$, into subsequences $\mu_{i1},\dots,\mu_{is}$ and
define $3_2$-equivalence classes. Each $\mu_{ij}$, $2 \le j \le s$, corresponds to a
$(k,k-2)$-subdecomposition of $G$. We stop after $k$ steps when every (not necessarily
balanced) sequence $\mu$ consists of a single monotonic piece and corresponds to a
$(k,1)$-subdecomposition of $G$; also note that $3_k$-equivalence coincides with
2-equivalence.

Then we define partial orders on these $3_i$-equivalence classes, denoted by
$\preceq^i$, $1 \le i \le k$, satisfying the following condition: for any completions
of $\preceq^i$ to linear orders $<^i$, $1 \le i \le k$, such that $<^j$ is a refinement
of $<^i$ for every $j > i$ (i.e., the restriction of $<^j$ to the $3_i$-equivalence
classes coincides with $<^i$), the triple
$(\stackrel{1}{\sim}, \stackrel{2}{\sim}, <^k)$ is a linear $k$-generative structure on
$G$. These partial orders as well as $3_i$-equivalences can be MS-defined for
suitably colored connected partial $k$-paths thanks to the properties of nice
decompositions.

4.3  Defining  a  CMS-Formula 

We partition our set of $3_i$-equivalence classes into the sets of $3_i$-nodes and
$3_i$-leaves, $1 \le i \le k$. Then we refine each of these partial orders,
$1 \le i \le k$, to a linear order on the set of $3_i$-nodes within each
$3_{i-1}$-equivalence class; every two vertices of $G$ are $3_0$-equivalent. However,
we cannot order leaves in the same way as we did in the case of $(k,1)$-paths,
because now they are not necessarily single vertices but instead correspond to
sequences of bags, and hence to words over $\Sigma_g$.

Let $A = (\Sigma_g, Q, \delta, q_0, F)$ be an automaton recognizing our family of partial
$k$-paths. We call two incomparable $3_i$-leaves within the same $3_{i-1}$-equivalence
class, $1 \le i \le k$, $\delta_i$-equivalent if the corresponding words $\omega_1$ and
$\omega_2$ over $\Sigma_g$ are such that for each $q \in Q$,
$\delta^*(q,\omega_1) = \delta^*(q,\omega_2)$, where $\delta^*$ is the extended transition
function of $A$. To determine if two leaves are $\delta_i$-equivalent, we need to
know the behavior of $A$ on the sequences of bags corresponding to those leaves.

The above discussion suggests the following "bottom-up" procedure which
can be encoded in CMS. We define the sequence of bags corresponding to each
$3_k$-equivalence class as in the case of $(k,1)$-paths, since each $3_k$-equivalence
class is the set of new vertices of a monotonic piece. Then we convert this sequence
into the word $\omega$ over $\Sigma_g$ and compute the behavior of $A$ on $\omega$. This
behavior is a map from $Q$ to $Q$, which can be presented as a state-vector
$q(\omega)$ of length $|Q|$. For each $3_{k-1}$-equivalence class $C$, two $3_k$-leaves
$C'$ and $C''$ in $C/\stackrel{3_k}{\sim}$ are $\delta_k$-equivalent iff $q(C') = q(C'')$.
We extend the partial order on the set $C/\stackrel{3_k}{\sim}$ to a linear order on
$C_\delta = (C/\stackrel{3_k}{\sim})/\stackrel{\delta_k}{\sim}$ by ordering incomparable
leaves lexicographically according to their state-vectors. Let $(C_1,\dots,C_s)$ be
the thus ordered sequence of elements of $C_\delta$. The behavior of $A$ on $C$ is
defined as $q(C) = q(C_1)^{t_1} \circ \dots \circ q(C_s)^{t_s}$, where $t_i = |C_i|$,
$1 \le i \le s$, and $\circ$ is the composition. By finiteness of $Q$, $q(C)$ can be
defined in CMS. Continuing in this manner will give us, after $k$ steps, the vector
$q(G)$ describing the behavior of $A$ on the entire $k$-decomposition of $G$. The graph
$G$ is recognized by $A$ iff $q(G)$ maps $q_0$ to some final state of $A$.
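
One step of this bottom-up procedure can be sketched as follows. The sketch assumes a deterministic automaton given by a transition table and encodes each state-vector as a tuple; the function names, the left-to-right reading of the composition, and the toy automaton in the usage note are illustrative assumptions rather than part of the paper.

```python
def state_vector(delta, n_states, word):
    """Behavior of the automaton on `word`: entry q is the state reached from q."""
    vector = []
    for q in range(n_states):
        for symbol in word:
            q = delta[(q, symbol)]
        vector.append(q)
    return tuple(vector)


def then(f, g):
    """State-vector of 'behave as f first, then as g'."""
    return tuple(g[f[q]] for q in range(len(f)))


def behavior_on_class(ordered_leaves, n_states):
    """Compose q(C_1)^t_1, ..., q(C_s)^t_s in the given order.

    `ordered_leaves` is a list of (state_vector, multiplicity) pairs, already
    sorted as in the text (assumed here to be read from left to right).
    """
    result = tuple(range(n_states))  # identity: behavior on the empty word
    for vector, multiplicity in ordered_leaves:
        for _ in range(multiplicity):
            result = then(result, vector)
    return result


def accepts(behavior, q0, final_states):
    """The input is recognized iff the overall behavior maps q0 into a final state."""
    return behavior[q0] in final_states
```

As a usage example, for a two-state automaton that tracks the parity of the letter `a`, composing the vectors of `"a"` three times and of `"ab"` twice gives the same vector as running the automaton on `"aaaabab"` directly.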

Thus, we can define a CMS-formula for recognizability of suitably colored
connected partial $k$-paths. As in the case of $(k,1)$-paths, there is an MS-formula
$\varphi_{adm}$ verifying that a coloring is suitable; hence recognizability implies
CMS-definability for connected partial $k$-paths. Note that the formula
$\exists$ "coloring $c$ of $G$" $\varphi_{adm}(c)$ holds on $G$ iff $G$ is a partial
$k$-path, so the obstruction set of the class of partial $k$-paths is computable.

For a disconnected partial $k$-path $G$, we compute the state-vectors for its
connected components, order these vectors lexicographically, and compute their
composition in CMS. Together with Courcelle's result this yields our main claim.

Theorem 2. Recognizability equals definability for partial $k$-paths.
Abstract. The maximum number of strands used is an important measure
of a molecular algorithm's complexity. This measure is also called
the space used by the algorithm. We show that every NP problem that
can be solved with $b(n)$ bits of nondeterminism can be solved by molecular
computation in a polynomial number of steps, with four test tubes,
in space $2^{b(n)}$. In addition, we identify a large class of recursive
algorithms that can be implemented using bounded nondeterminism. This
yields improved molecular algorithms for important problems like 3-SAT,
independent set, and 3-colorability.


1.  A  model  of  molecular  computing 

Molecular computation was first studied in [1, 17]. The models we define were
inspired as well by the work of [3, 23]. A molecular sequence is a string over
an alphabet $\Sigma$ (we can use any alphabet we like, encoding characters of $\Sigma$ by
finite sequences of base pairs). A test tube is a multi-set of molecular sequences.
We describe the allowable operations below. Where set notation is applied to
multi-sets, multiplicities are respected. In the definitions $T_1$, $T_2$, and $T_3$
denote distinct test tubes, $c$ denotes a character, and $i$ denotes a positive integer.

Separate(T1, c, i, T2, T3)
    T2 := the multi-set of all strings in T1 whose ith character is c;
    T3 := the multi-set of all strings in T1 whose ith character is not c;
    T1 := ∅.

Pour(T1, T2)
    T2 := T1;
    T1 := ∅.

Append(T, c)
    T := {xc : x ∈ T}.
CCR-8958528 and CCR-9415410. Email: fu-bin@cs.yale.edu
Merge(T1, T2, T3)
    T3 := T1 ∪ T2;
    T1 := ∅;
    T2 := ∅.

Others have proposed a variant of operation Separate, which we will call Sep.
It checks whether a string contains the character $c$ anywhere. If we represent the
$i$th symbol $z_i$ of a string $z$ by the symbol $(i, z_i)$ instead, then the standard Sep
operation can simulate our Separate operation with no additional overhead. The
use of polynomial-size alphabets is standard practice in molecular computing.
We prefer the Separate operation for convenience in programming.
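
The four operations defined above can be mimicked on ordinary multi-sets. The following is a minimal sketch, assuming that a test tube is modeled as a `collections.Counter` from strings to multiplicities and that the union in Merge adds multiplicities; the function names and the `space` helper are illustrative and not part of the paper.

```python
from collections import Counter


def separate(t1, c, i, t2, t3):
    """Separate(T1, c, i, T2, T3): split T1 by its i-th character (1-indexed)."""
    t2.clear()
    t3.clear()
    for string, mult in t1.items():
        matches = len(string) >= i and string[i - 1] == c
        (t2 if matches else t3)[string] += mult
    t1.clear()


def pour(t1, t2):
    """Pour(T1, T2): T2 := T1; T1 := empty, as the definition is printed."""
    t2.clear()
    t2.update(t1)
    t1.clear()


def append(t, c):
    """Append(T, c): attach the character c to the end of every string in T."""
    grown = Counter({string + c: mult for string, mult in t.items()})
    t.clear()
    t.update(grown)


def merge(t1, t2, t3):
    """Merge(T1, T2, T3): T3 := T1 united with T2 (multiplicities add); T1, T2 emptied."""
    total = t1 + t2
    t3.clear()
    t3.update(total)
    t1.clear()
    t2.clear()


def space(*tubes):
    """Number of strings currently held in the given tubes, counting multiplicities."""
    return sum(sum(tube.values()) for tube in tubes)
```

None of these operations creates or destroys strings, so `space` stays constant over a computation that uses only them.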

The running time for a molecular algorithm is proportional to the number
of operations on test tubes. An important complexity measure is the solution
space size (also called simply space), i.e., the maximum number of strings in all
test tubes at any time, counting multiplicities. Adleman [2] has speculated that
molecular computation with a solution space of size $2^{70}$ (about 0.002 moles)
might be possible. Recent papers [3, 19] attempt to optimize solution space size
for particular combinatorial problems.

Problem instances are associated with a parameter $n$ called their size. In
complexity theory, $n$ is the length of a suitable encoding of the instance. However,
in analysis of algorithms, $n$ is usually a more natural representation-independent
parameter, such as the number of vertices in a graph or number of variables
in a formula. Although the $n$'s of complexity theory and the $n$'s of analysis of
algorithms are usually polynomially related, it can make a phenomenal difference
when $n$ appears in the exponent. For that reason we take $n$ to be a
problem-dependent but representation-independent notion of size throughout this
paper. We write $|x|$ to denote the size of a problem instance $x$ rather than its
length, and we usually identify $n$ with $|x|$.

We consider a highly restricted model of $t(n)$-time, $s(n)$-space molecular
computation, which we think has a good chance of eventually being practical.
On input $x$, one test tube $T_0$ is initialized to hold encodings of the numbers
$1,\dots,s(|x|)$. A sequence of molecular operations $o_1,\dots,o_{t(|x|)} = f(x)$ is then
performed, where $f$ is a conventional polynomial-time computable function (that
is, the program is uniform in a weak but appropriate sense). The computation
accepts if $T_0$ is nonempty after the last operation is performed. MOL($s(n)$) is
the class of languages accepted by such a computation where the running time
$t(n)$ is polynomially bounded.

We give the most space-efficient molecular algorithms known for several
problems. See Table 1.

2.  Bounded  Nondeterminism 

NP computation with a limited amount of nondeterminism was introduced
in [14, 15, 16] and studied further in [10, 11, 20, 9, 12, 25, 13, 7]. The class
NPbits($b(n)$) consists of all languages recognized by an NP machine that makes
at most $b(n)$ binary nondeterministic choices on each computation path on
inputs of size $n$. (Actually, prior treatments allowed $O(b(n))$ binary choices, but
                    Previously                            In This Paper
Problem             Space     Limited Model   Reference   Space     Limited Model
Hamiltonian Path    n!        ✓               [1]
SAT                 2^n       ✓               [17]
QBF                 2^n       ×               [23]
3-SAT               1.62^n    ×               [19]        1.50^n    ✓
3-Colorability      1.89^n    ✓               [3]         1.35^n    ✓
Independent Set     1.51^n    ✓               [3]         1.23^n    ✓
(3, 2)-system                                             1.39^n    ✓

Table 1. Results for particular problems


the constant factor turns out to be very important in connection with molecular
computation.) We define a refinement of these classes: NPinit($s(n)$) consists
of all languages recognized by NP machines that nondeterministically choose a
number between 1 and $s(n)$ on inputs of size $n$ and then behave deterministically.
Clearly, NPbits($b(n)$) = NPinit($2^{b(n)}$).
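
The correspondence behind this equality can be sketched as a bijection between the numbers $1,\dots,2^{b}$ and sequences of $b$ binary choices. The function names below are hypothetical, and the sketch assumes that a computation path making fewer than $b$ choices is padded to exactly $b$ of them.

```python
def choices_from_number(z, b):
    """Decode an initial guess z in 1..2**b into b binary choices (most significant first)."""
    if not 1 <= z <= 2 ** b:
        raise ValueError("z must lie between 1 and 2**b")
    return [((z - 1) >> (b - 1 - j)) & 1 for j in range(b)]


def number_from_choices(bits):
    """Encode a sequence of binary choices as a single number in 1..2**len(bits)."""
    z = 0
    for bit in bits:
        z = 2 * z + bit
    return z + 1
```

A machine that guesses a single number up front can therefore replay any run of a machine that makes its binary choices one at a time, and conversely.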

3.  NPinit(s(n)) ⊆ MOL(s(n))

In this section we show how to simulate bounded nondeterministic computation
via bounded-space molecular computation. Results of this type appear in
[4, 23, 24, 29], but they assume models of molecular computation with more
powerful operations, such as Amplify, that may be harder to implement in practice.
Independently, Boneh et al. [8] obtained a result similar to ours.

Lemma 1. Let $\pi$ be a circuit with $m$ gates. Given a tube $T_0$, a molecular
algorithm using only the operations Pour, Append, and Merge, running in time
$O(m)$, and using only four test tubes can create tubes $T_1$ and $T_2$ such that $T_1$
contains all strings $z$ from tube $T_0$ that satisfy $\pi(z) = 1$ and $T_2$ contains all
strings $z$ from tube $T_0$ that satisfy $\pi(z) = 0$.

Proof. Let $\pi$'s input gates be $g_1,\dots,g_n$ and internal gates be $g_{n+1},\dots,g_m$ in
topological order; in particular $g_m$ is the output gate. We will use four tubes
$T_0, T_1, T_2, T_3$. For each $i$, let $g_i$ compute $f_i(g_{j(i)}, g_{k(i)})$ where $j(i) < i$,
$k(i) < i$, and $f_i$ is a binary function. We perform the following algorithm:

for i := n + 1 to m do
    Separate(T0, 0, j(i), T1, T2)
    Separate(T1, 0, k(i), T0, T3)
    Append(T0, f_i(0, 0))
    Append(T3, f_i(0, 1))
    Merge(T0, T3, T1)
    Separate(T2, 0, k(i), T0, T3)
    Append(T0, f_i(1, 0))
    Append(T3, f_i(1, 1))
    Merge(T0, T3, T2)
    Merge(T1, T2, T0)
Separate(T0, 1, m, T1, T2)

At completion, $T_1$ contains all strings that satisfy $\pi$ and $T_2$ contains all strings
that do not satisfy $\pi$. ∎
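
The algorithm in the proof can be simulated on ordinary multi-sets. The sketch below is illustrative only: tubes are modeled as `collections.Counter` objects, a gate is given as a hypothetical triple `(j, k, f)` with `f` a table on pairs of bit characters, and the final separation tests for the character 1 so that the first returned tube receives the satisfying strings, as the lemma statement requires.

```python
from collections import Counter
from itertools import product


def separate(src, c, i, yes, no):
    """Strings of src whose i-th character (1-indexed) is c go to yes, the rest to no."""
    yes.clear()
    no.clear()
    for string, mult in src.items():
        (yes if string[i - 1] == c else no)[string] += mult
    src.clear()


def append(tube, c):
    """Attach the character c to every string in the tube."""
    grown = Counter({string + c: mult for string, mult in tube.items()})
    tube.clear()
    tube.update(grown)


def merge(a, b, dest):
    """dest := a united with b (multiplicities add); a and b are emptied."""
    total = a + b
    dest.clear()
    dest.update(total)
    a.clear()
    b.clear()


def evaluate_circuit(tube0, n, gates):
    """Append one character per internal gate, then split on the output gate.

    gates[r] describes gate n+1+r as (j, k, f): its inputs are gates j and k
    (both smaller than its own index) and f maps pairs of '0'/'1' to '0'/'1'.
    Returns (satisfying strings, non-satisfying strings).
    """
    t0, t1, t2, t3 = Counter(tube0), Counter(), Counter(), Counter()
    for j, k, f in gates:
        separate(t0, "0", j, t1, t2)   # t1: gate j is 0, t2: gate j is 1
        separate(t1, "0", k, t0, t3)   # t0: inputs (0,0), t3: inputs (0,1)
        append(t0, f[("0", "0")])
        append(t3, f[("0", "1")])
        merge(t0, t3, t1)
        separate(t2, "0", k, t0, t3)   # t0: inputs (1,0), t3: inputs (1,1)
        append(t0, f[("1", "0")])
        append(t3, f[("1", "1")])
        merge(t0, t3, t2)
        merge(t1, t2, t0)
    separate(t0, "1", n + len(gates), t1, t2)
    return t1, t2
```

Only four tubes are ever live, and each gate costs a constant number of operations, which matches the $O(m)$ bound claimed in the lemma.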

Theorem 2. NPinit(s(n)) ⊆ MOL(s(n)).

Proof. Let $L$ be accepted by an NPinit($s(n)$) machine $M$. Construct a deterministic
machine $M'$ that takes as inputs a string $x$ and a positive integer $z \le s(n)$
and accepts iff $M$ accepts input $x$ with nondeterministic guess $z$. Obtain $M'_x$ by
fixing the input $x$, so the only input to $M'_x$ is the number $z$. Construct a circuit
$\pi$ equivalent to $M'_x$ in the usual way (see [21]). Apply Lemma 1 to $\pi$ to see that
$L$ is in MOL($s(n)$). ∎

4.  Implementing  Recursion  with  Bounded 
Nondeterminism 

In this section we show how to enumerate search spaces using bounded
nondeterminism. In many nondeterministic searches, some paths are longer than
others, which can be inefficient. However, if we can compute the size of subtrees,
then we can balance nondeterministic search trees, which reduces the amount of
nondeterminism needed.

Recursive algorithms for NP problems usually take the form of
d-self-reductions ("d" for disjunctive). Self-reductions were defined in [27] and
d-self-reductions were defined in [28].

Definition 3. Let $|y|$ denote the size of the problem instance $y$. A partial order
$\prec$ is polynomially well-founded if there exists a polynomial-bounded function $p$
such that

-  $y_m \prec \dots \prec y_1 \Rightarrow m \le p(|y_1|)$

-  $y_m \prec \dots \prec y_1 \Rightarrow |y_m| \le p(|y_1|)$

For technical simplicity we will consider only languages $L$ containing the
empty string, $\lambda$.

Definition 4. A d-self-reduction for a language $L$ consists of a polynomial time
computable function $h(x) = \{x_1,\dots,x_m\}$ and a polynomially well-founded partial
order $\prec$ on problem instances such that

-  $\lambda$ is the only minimal element under $\prec$

-  for all $x \ne \lambda$, $x \in L \iff h(x) \cap L \ne \emptyset$

-  for all $x$, $x_i \in h(x) \Rightarrow x_i \prec x$

Definition 5. Let $(h, \prec)$ be a d-self-reduction and let $x$ be a problem instance.

-  $T_{h,\prec}(x)$ is the unordered rooted tree that satisfies the following rules: (1) the
root is $x$; (2) for each $y$, the set of children of $y$ is $h(y)$.

-  $|T_{h,\prec}(x)|$ = number of leaves in $T_{h,\prec}(x)$.

If $(h, \prec)$ is a self-reduction for $L$, then the corresponding recursive algorithm
for $L$ runs in time $|T_{h,\prec}(x)|$. The analysis of such an algorithm usually
provides a bound on $|T_{h,\prec}(x)|$ that is suitable for use in constructing a molecular
algorithm for $L$. We formalize this below:

Definition 6. Let $T$ be a polynomial-time computable function. A language $L$
is in REC($T(x)$) if there is a d-self-reduction $(h, \prec)$ for $L$ such that for all $x$

(1) $|T_{h,\prec}(x)| \le T(x)$, and

(2) $\sum_{x_i \in h(x)} T(x_i) \le T(x)$.

Lest conditions (1) and (2) above seem restrictive, we argue that they are
quite natural. We consider the typical analysis of a recursive algorithm. One
introduces a function $T$ and proves by induction on $|x|$ that $|T_{h,\prec}(x)| \le T(x)$,
which is (1). The inductive hypothesis is that $|T_{h,\prec}(w)| \le T(w)$ if $|w| < |x|$.
Inspection of the algorithm yields

$$|T_{h,\prec}(x)| = \sum_{x_i \in h(x)} |T_{h,\prec}(x_i)| \le \sum_{x_i \in h(x)} T(x_i) \quad \text{by the inductive hypothesis.}$$

The last step in the induction consists of showing that $T$ satisfies
$\sum_{x_i \in h(x)} T(x_i) \le T(x)$, which is (2). The only other requirement on $T$ is
that $T$ be polynomial-time computable. We will deal with that later in this section.

The  function  T  above  depends  on  problem  instances  rather  than  their  size 
because  the  analysis  of  the  algorithm  may  depend  on  two  or  more  parameters. 
We  will  need  an  analogous  variant  of  NPinit(). 

Definition 7. NPinit'($S(x)$) consists of languages recognized by NP machines
that nondeterministically choose a number between 1 and $S(x)$ on input $x$ and
then behave deterministically.

Clearly, if $S(x) \le s(|x|)$ then NPinit'($S(x)$) ⊆ NPinit($s(n)$).

Theorem 8. REC($T(x)$) ⊆ NPinit'($T(x)$).

Proof. Let $L \in$ REC($T(x)$) via $(h, \prec)$. We will define a deterministic
polynomial-time computable function path($i$, $x$) taking values in $\{0, 1, \lambda\}$
such that path($1$, $x$) $\cdots$ path($T(x)$, $x$) is equal to the sequence of values at the
leaves of $T_{h,\prec}(x)$ in canonical order. The proof is completed by having the $i$th
path of an NPinit'($T(x)$) machine compute path($i$, $x$); clearly that machine
accepts $L$. The function path($i$, $x$) will be computed via tail recursion.
function path(i, x)
    if x = λ then return true
    else if h(x) = ∅ then return false
    else
        {x_1, ..., x_m} := h(x)
        for j := 1 to m do
            if i ≤ T(x_j) then return path(i, x_j)
            else i := i − T(x_j)
        return λ

∎
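
The function path can be tried out on a toy d-self-reduction. The sketch below is an illustration under stated assumptions, not the paper's construction: instances are subset-sum pairs `(target, items)`, the trivially true instance `(0, ())` plays the role of $\lambda$, `h` branches on including or excluding the first item, and `T(x) = 2 ** len(items)` serves as the bound. The value `None` stands for the padding value returned when `i` exceeds the number of leaves.

```python
LAMBDA = (0, ())  # stands in for the empty instance, the only minimal element


def h(x):
    """Children of x: include the first item (if it still fits) or exclude it."""
    target, items = x
    if not items:
        return []
    first, rest = items[0], items[1:]
    children = []
    if target - first >= 0:
        children.append((target - first, rest))
    children.append((target, rest))
    return children


def T(x):
    """Bound on the number of leaves below x; each child's bound is half of it."""
    return 2 ** len(x[1])


def path(i, x):
    """Value at the i-th leaf slot below x, with the tail recursion written as a loop."""
    while True:
        if x == LAMBDA:
            return True
        children = h(x)
        if not children:
            return False
        for child in children:
            if i <= T(child):
                x = child            # descend into this child
                break
            i -= T(child)            # skip the slots reserved for this child
        else:
            return None              # i points past the last child: padding


def accepts(x):
    """Accept iff some initial guess i in 1..T(x) leads to a true leaf."""
    return any(path(i, x) is True for i in range(1, T(x) + 1))
```

Each call follows a single root-to-leaf path, so the work per guess is bounded by the depth of the tree times the cost of evaluating `h` and `T`.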


Now  we  give  sufficient  conditions  for  T  to  be  polynomial-time  computable. 

Definition 9. We say that a partial order $\prec$ on problem instances is
parameterizable if there are a function $m$ from problem instances to a set $M$, a
partial order $\prec'$ on $M$, and a polynomial $p$ such that

-  $m(x)$ is computable in time polynomial in $|x|$, and

-  $x \prec y \Rightarrow m(x) \prec' m(y)$, and

-  $\|\{\mu : \mu \prec' m(x)\}\| \le p(|x|)$.

In many examples we will take $m(x) = |x|$ and $\prec'$ to be the standard linear
order on natural numbers. In other examples, $m(x)$ will be a tuple of
parameters (such as the number of 2-clauses and the number of 3-clauses in a
Boolean formula); in many (but not all) of these examples we use the partial
order $(a_1,\dots,a_k) \prec' (b_1,\dots,b_k)$ if $(\forall i)[a_i \le b_i]$ and $(\exists i)[a_i < b_i]$.

Definition 10. Given $h$ and $m$, define

-  mh($x$) = the multi-set $\{m(x_i) : x_i \in h(x)\}$

-  MH($x$) = the set $\{\text{mh}(y) : m(y) = m(x)\}$

Definition 11. A d-self-reduction $(h, \prec)$ is by cases if $\prec$ is parameterizable via
$(m, \prec')$ in such a way that MH($x$) is computable in time polynomial in $|x|$.

Lemma 12. Let $(h, \prec)$ be a d-self-reduction by cases with parameter function
$m()$. Let $T_0$ be the least function $T$ such that

(1) $|T_{h,\prec}(x)| \le T(x)$

(2) $\sum_{x_i \in h(x)} T(x_i) \le T(x)$

(3) $T(x)$ is a function of $m(x)$

Then $T_0$ exists and $T_0(x)$ is computable in time polynomial in $|x|$.

Proof. Let $(h, \prec)$ have a parameterization $(m, \prec')$ where $m(x)$ and MH($x$) are
computable in time polynomial in $|x|$. Define a partial function $t$ from $M$ to
natural numbers recursively:

$$t(\mu) = \begin{cases} 1 & \text{if } \mu \text{ is a minimal element under } \prec' \\ \max_{S \in \text{MH}(x),\ m(x) = \mu} \sum_{\nu \in S} t(\nu) & \text{otherwise} \end{cases}$$

If $\mu$ is in the range of $m$, then $t(\mu)$ is defined because MH($x$) is a finite set for
every $x$. Now it is easy to see that $t \circ m$ is the least function satisfying (1,2,3).

By Definition 9, $\|\{\mu : \mu \prec' m(x)\}\| \le p(|x|)$. If we compute $t(m(x))$ by the
obvious recursion, at most $p(|x|)$ different subproblems will arise. If we use a
table to avoid recomputation, the recursion will run in polynomial time. ∎
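
The table-based recursion can be sketched as follows. The sketch assumes that $t$ takes the value 1 on minimal parameters and otherwise the maximum, over the cases in MH, of the sum of $t$ over the sub-parameters, which is what conditions (2) and (3) appear to force. The `cases` function is a hypothetical stand-in for MH, instantiated with the three branching patterns of the 3-SAT algorithm analyzed in the next subsection.

```python
from functools import lru_cache


def cases(mu):
    """Hypothetical MH: the multi-sets of sub-parameters that can arise at parameter mu."""
    if mu == 0:
        return []                      # minimal element: no subproblems
    if mu == 1:
        return [(0,)]
    return [(mu - 1,), (mu - 2, mu - 2), (mu - 2, mu - 1)]


@lru_cache(maxsize=None)               # the table that avoids recomputation
def t(mu):
    """Least bound that depends only on mu and dominates every case."""
    options = cases(mu)
    if not options:
        return 1
    return max(sum(t(nu) for nu in case) for case in options)
```

With the cache in place, each parameter value is evaluated once, so the number of evaluations is bounded by the number of parameters below $m(x)$, as in the proof.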

4.1.  3-SAT

In this section, we apply our results to the classic 3-SAT algorithm of Monien and
Speckenmeyer [18] and a recent unverified 3-SAT algorithm of Schiermeyer [26].
The former yields a simple MOL($1.62^n$) algorithm, and the latter (assuming that
Schiermeyer's paper is correct) yields a MOL($1.497^n$) algorithm.

Monien and Speckenmeyer's Algorithm. The size of a satisfiability instance
is the number of variables. Consider the 3-SAT algorithm of Monien and
Speckenmeyer. Let $f|_{\ell}$ denote the formula obtained by replacing in $f$ the literal
$\ell$ by true and $\bar{\ell}$ by false. A $k$-clause is a disjunction of $k$ literals. The
function 3SAT takes a formula $f$ consisting of some 3-clauses and at least one
1-clause or 2-clause.

function 3SAT(f)
    if f is the empty set of clauses then return true
    else if f contains an empty clause then return false
    else if some variable v appears only in positive literals then return 3SAT(f|_v)
    else if some variable v appears only in negative literals then return 3SAT(f|_¬v)
    else if f contains a clause C consisting of a single literal ℓ then return 3SAT(f|_ℓ)
    else if f contains a clause C consisting of two literals ℓ1, ℓ2 then
        return 3SAT(f|_ℓ1) ∨ 3SAT(f|_{¬ℓ1, ℓ2})
    else
        let v be the first variable to appear in f
        return 3SAT(f|_v) ∨ 3SAT(f|_¬v)

The last case in the recursion is ostensibly the worst, yielding two subproblems
of size $n - 1$, but it only occurs on the first call or immediately after
eliminating a single variable, which yields a single subproblem of size $n - 1$;
unrolling the recursion, we see that the last case gives two subproblems of size
$n - 2$. The worst case is the second to the last, which yields subproblems of size
$n - 1$ and $n - 2$. Thus the number of leaves in the self-reduction is at most $2f(n)$,
where $f(n)$ is given by the recurrence $f(n) = f(n - 1) + f(n - 2)$; in particular
$2f(n) < 1.62^n$ for almost all $n$.
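The recursion above can be sketched in Python as follows. This is only an illustrative sketch: the clause representation (frozensets of nonzero integers, with -v standing for the negation of variable v), the helper names `assign`, `sat3` and `brute_force`, and the reading of "first variable" as the smallest-numbered one are assumptions made here, not details taken from Monien and Speckenmeyer.

```python
import itertools

def assign(clauses, lit):
    """Return f|lit: drop clauses satisfied by lit, delete the false literal -lit elsewhere."""
    return [c - {-lit} for c in clauses if lit not in c]

def sat3(clauses):
    """Branching satisfiability test that follows the case order of the pseudocode."""
    if not clauses:
        return True                        # empty set of clauses
    if any(len(c) == 0 for c in clauses):
        return False                       # an empty clause cannot be satisfied
    lits = set().union(*clauses)
    for lit in sorted(lits):
        if -lit not in lits:               # variable occurs with one polarity only
            return sat3(assign(clauses, lit))
    for c in clauses:
        if len(c) == 1:                    # clause with a single literal
            (lit,) = c
            return sat3(assign(clauses, lit))
    for c in clauses:
        if len(c) == 2:                    # clause with two literals: sizes n-1 and n-2
            l1, l2 = sorted(c)
            return (sat3(assign(clauses, l1))
                    or sat3(assign(assign(clauses, -l1), l2)))
    v = min(abs(lit) for lit in lits)      # assumed meaning of "first variable"
    return sat3(assign(clauses, v)) or sat3(assign(clauses, -v))

def brute_force(clauses):
    """Reference check that tries every truth assignment."""
    variables = sorted({abs(lit) for c in clauses for lit in c})
    for bits in itertools.product([False, True], repeat=len(variables)):
        value = dict(zip(variables, bits))
        if all(any(value[abs(lit)] == (lit > 0) for lit in c) for c in clauses):
            return True
    return False
```

The brute-force routine is included only so that the branching procedure can be compared against an independent answer on small formulas.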

The algorithm above is clearly a d-self-reduction for 3-SAT. The value
function $h$ for a formula is the set of subformulas generated by the recursive
algorithm. Let $m(x) = n$, where $n$ is the number of variables in the formula,
and let $<$ be the normal order for the integers. From the analysis above we
know mh($x$) is either $\{n - 1\}$, $\{n - 2, n - 2\}$ or $\{n - 2, n - 1\}$. MH($x$) is
$\{\{n - 1\}, \{n - 2, n - 2\}, \{n - 2, n - 1\}\}$, which is clearly polynomial-time computable.
Let $t(n) = 1.62^n$. Then $2f(n) < t(n)$. Hence $t(n)$ is an upper bound on the number
of leaves of the computation tree for the recursive algorithm. It is easy to see that
$t(n) \ge t(n - 2) + t(n - 1) \ge t(n - 2) + t(n - 2)$. Hence, $T_{3S}(x) = t(m(x))$ satisfies
the conditions of Lemma 12 and 3-SAT is in REC($T_0(F)$) for some $T_0 \le T_{3S}$. By
Theorem 8 and Theorem 2, 3SAT $\in$ NPinit($T_{3S}(n)$), so 3SAT is in MOL($1.62^n$).
The same space bound for 3-SAT was obtained previously by Ogihara [19], but
in a model that allows more powerful operations like Polymerization, which can
implement the Amplify operation.
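The two numerical claims used in this argument can be checked with a small table-based computation. The sketch below is illustrative only: the base values $f(0) = f(1) = 1$ and the function names are assumptions introduced here, since the text fixes only the recurrence.

```python
def leaf_bound_table(n):
    """Fill a table with f(k) = f(k-1) + f(k-2); base values f(0) = f(1) = 1 are assumed."""
    f = [1] * (n + 1)
    for k in range(2, n + 1):
        f[k] = f[k - 1] + f[k - 2]   # each entry is computed once and reused
    return f

def t(n):
    """The candidate bound t(n) = 1.62^n from the text."""
    return 1.62 ** n

def t_dominates(n):
    """Check t(n) >= t(n-2) + t(n-1) >= t(n-2) + t(n-2) for one value of n."""
    return t(n) >= t(n - 2) + t(n - 1) >= 2 * t(n - 2)
```

With these assumed base values the inequality $2f(n) < 1.62^n$ fails at $n = 100$ and holds at $n = 400$, which is consistent with the phrase "for almost all $n$": the bases 1.62 and the golden ratio are so close that the bound only takes over for fairly large $n$.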

Schiermeyer’s Algorithm Schiermeyer [26] reports a $1.497^n$ time algorithm
for the 3-SAT problem. His algorithm is a d-self-reduction for the 3-SAT problem.
We will prove that 3SAT $\in$ REC($T(F)$), where the function $T(F) < 1.497^n$
will be defined below. We follow [26] to define $F_3$ and $F_2$. For a formula
$F$ with $n$ variables, let $p$ be the maximum number of 1-clauses and 2-clauses
(with preference of 1-clauses) such that no variable occurs more than twice.
Let $q$ be the number of remaining 2-clauses and define $m = p + \min\{2, q\}$. Let
Fsin)  =  •  fr  and  F^{n)  =  ,  where  /?  =  1.4963,  a  =  1.04855  and  c  is  a 

sufficiently  large  constant. 

$$T(F) = \begin{cases} F_2(n) & \text{if } F \text{ contains 1-clauses or 2-clauses} \\ F_3(n) & \text{otherwise} \end{cases}$$

Schiermeyer states that $F_2(n) \ge |T_{h,\prec}(F)|$ if $F$ has at least one 1-clause or
2-clause, and that $F_3(n) \ge |T_{h,\prec}(F)|$ for all $F$. Hence, $T(F) \ge |T_{h,\prec}(F)|$, since
$|T_{h,\prec}(F)| \le$ the number of recursive calls. The inequalities that Schiermeyer gives
in the proofs of his Lemma 4.3 and Lemma 4.4 imply that our $T(F)$ satisfies the
conditions of Definition 6. Hence, 3SAT $\in$ REC($T(F)$) $\subseteq$ MOL($1.497^n$).

4.2.  3-Coloring  and  (3,  2)-System 

Beigel and Eppstein [6] give algorithms for (3, 2)-system and 3-coloring. In the
$(a, b)$-system problem, we are given a collection of $n$ vertices, each of which
can be given one of $a$ different colors. However, certain color combinations are
disallowed: we are also given a set of constraints, each of which forbids one
coloring of some $b$-tuple of variables. (3, 2)-system generalizes 3-coloring, 3-SAT
and 3-edge-coloring.

(3, 2)-System Algorithm The size of a (3, 2)-system is the number of variables
in it. Beigel and Eppstein’s [6] (3, 2)-system algorithm can be sketched as follows:

function 32SYS($F$)
    if $|F| < 5$ then return brute-force($F$)
    else
        $(F_1, \ldots, F_k) := h_{32}(F)$
        return $\bigvee_{i=1}^{k}$ 32SYS($F_i$)

In the algorithm above, brute-force($F$) means “use the brute force method
to solve the (3, 2)-system $F$”; $k \le 3$; $h_{32}$ is polynomial-time computable; and
$|F_i| < |F|$. Let $h = h_{32}$ and let $\prec$ be the standard linear ordering on the natural
numbers. Then $(h, \prec)$ is a d-self-reduction for (3, 2)-system. Define $m(F) := |F|$.
In case 1, mh($F$) = $\{n - (4 + i), n - 1\}$, where $i \ge 0$.




In cases 2a, 2c, and 3, mh($F$) = $\{n - (3 + i), n - 2\}$, where $i \ge 0$.

In cases 2b, 2d, 6, 8c and 9, mh($F$) = $\{n'\}$, where $n' < n$.

In case 4, mh($F$) = $\{n - 5, n - 3, n - 3\}$.

In case 5, mh($F$) = $\{n - 4, n - 4\}$.

In case 7, mh($F$) = $\{n - 3, n - 3\}$.

MH($F$) is polynomial-time computable by the case analysis above. Let $t(n) =
1.38028^n$. It is easy to see that for every input $F$ with $\{n_1, \ldots, n_k\}$ = mh($F$) and
$m(F) = n$, $t(n) \ge t(n_1) + \cdots + t(n_k)$. Define $T(F) = t(m(F))$. By Lemma 12,
there is a polynomial-time computable function $T_0$ such that $T_0(F) \le T(F)$
and $T_0$, $h$, and $\prec$ satisfy the conditions of Definition 6. Thus, (3, 2)-system is in
REC($T_0(F)$). So,

(3, 2)-system $\in$ NPinit($T_0(F)$)     by Theorem 8
              $\subseteq$ NPinit($T(F)$)      because $T_0(F) \le T(F)$
              $=$ NPinit($t(n)$)      because $T(F) = t(|F|)$
              $=$ NPinit($1.38028^n$) because $t(n) = 1.38028^n$
              $\subseteq$ MOL($1.38028^n$)    by Theorem 2.
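The step "it is easy to see that $t(n) \ge t(n_1) + \cdots + t(n_k)$" can be checked numerically from the case analysis. In the sketch below, each case is encoded by the amounts subtracted from $n$ (taking $i = 0$, the hardest instance of cases 1, 2a, 2c and 3); the dictionary `CASES` and the function names are illustrative assumptions. Cases with a single smaller subproblem satisfy the inequality trivially and are omitted.

```python
# Size reductions per branching case, read off the case analysis with i = 0.
CASES = {
    "case 1": (4, 1),
    "cases 2a, 2c, 3": (3, 2),
    "case 4": (5, 3, 3),
    "case 5": (4, 4),
    "case 7": (3, 3),
}

def satisfies(base, drops):
    """True if t(n) = base**n obeys t(n) >= sum_i t(n - d_i).

    Dividing both sides by base**n gives a condition that does not depend on n.
    """
    return sum(base ** -d for d in drops) <= 1.0

def branching_number(drops, lo=1.0, hi=2.0, iters=60):
    """Smallest base in [lo, hi] that satisfies the recurrence, found by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if sum(mid ** -d for d in drops) > 1.0:
            lo = mid          # base too small: the branches outgrow t(n)
        else:
            hi = mid
    return hi
```

Under this encoding, case 1 is the binding one: its branching number agrees with 1.38028 to about five decimal places, while the other cases leave slack.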

3-Coloring Algorithm There are two parts to Beigel and Eppstein’s algorithm.
The first part runs in polynomial time and finds an independent set $S$
with a lot of neighbors. Let $\Gamma(S)$ denote the set of vertices in $G$ that are not
in $S$ but are adjacent to an element of $S$. The second part 3-colors $S$ in all
possible ways. Each of these $3^{|S|}$ partially-colored graphs is transformed into an
equivalent (3, 2)-system with $n - |S| - |\Gamma(S)|$ variables, which is solved by calling
32SYS. Their algorithm runs in time $3^{|S|} \cdot 1.38028^{n - |S| - |\Gamma(S)|}$, which is less
than $1.345^n$ for sufficiently large $n$. Thus we have the following NPinit($1.345^n$)
algorithm:

choose a natural number $m < 1.345^n$
construct Beigel and Eppstein’s set $S$
let $c = m \bmod 3^{|S|}$
color $S$ with the $c$th 3-coloring in the lexicographical ordering
form the corresponding (3, 2)-system $F$
let $b = \lfloor m / 3^{|S|} \rfloor$
run 32SYS($F$) using the nondeterministic choices dictated by $b$

Therefore 3-coloring is in MOL($1.345^n$).
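The way a single guessed number $m$ is split into a coloring index and a choice string can be sketched as follows. The function names and the encoding of colors as 0, 1, 2 are assumptions made for illustration; how $b$ then drives the choices inside 32SYS is not modelled here.

```python
def decode_choice(m, s_size):
    """Split m into (c, b): c selects a 3-coloring of S, b is passed on to 32SYS."""
    c = m % 3 ** s_size
    b = m // 3 ** s_size
    return c, b

def cth_coloring(c, s_size):
    """The c-th assignment of colors {0, 1, 2} to the vertices of S in lexicographic order."""
    digits = []
    for _ in range(s_size):
        c, d = divmod(c, 3)      # peel off base-3 digits, least significant first
        digits.append(d)
    return digits[::-1]
```

Reading $c$ as a base-3 numeral with $|S|$ digits makes numeric order coincide with lexicographic order on colorings, which is why a plain digit expansion suffices.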

4.3.  Independent  Set 

For a graph $G$, an independent set $S$ is a subset of $G$’s nodes such that there is
no edge between any two nodes in $S$. The independent set problem is “given a
graph $G$ and a number $k$, does $G$ contain an independent set of cardinality at
least $k$?”

Tarjan’s Algorithm Consider the following simple algorithm due to Tarjan [30].
($d(v)$ denotes the degree of $v$, and $N(v)$ denotes the neighbor set of
$v$. max($S$, $T$) denotes the larger of the two sets $S$ and $T$, with ties resolved
arbitrarily.)

function MIS($G$)
    pick any vertex $v$ in $G$
    if $d(v) \le 1$ then return $\{v\} \cup$ MIS($G - v - N(v)$)
    else return max(MIS($G - v$), $\{v\} \cup$ MIS($G - v - N(v)$))

This is a self-reduction with at most $T(n)$ leaves, where $T(n)$ satisfies
$T(n) = T(n - 1) + T(n - 3)$, so that $T(n) = O(1.47^n)$. The recurrence can be
solved in polynomial time by an explicit formula or by dynamic programming, so
the independent set problem is in MOL($1.47^n$), which is better than prior
results [3]. Because the algorithm is particularly simple, the molecular
algorithm can even be made to run in linear time.
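A minimal Python sketch of this recursion is given below. The adjacency-dictionary representation, the explicit base case for the empty graph, and the helper names are assumptions added here so that the code runs; the brute-force routine serves only as an independent check on small graphs.

```python
import itertools

def remove(graph, vertices):
    """Return the graph with the given vertices (and their incident edges) deleted."""
    gone = set(vertices)
    return {u: nbrs - gone for u, nbrs in graph.items() if u not in gone}

def mis(graph):
    """graph maps each vertex to the set of its neighbours (undirected, no loops)."""
    if not graph:
        return set()                       # base case assumed for the empty graph
    v = next(iter(graph))                  # "pick any vertex v in G"
    with_v = {v} | mis(remove(graph, {v} | graph[v]))
    if len(graph[v]) <= 1:
        return with_v                      # some maximum independent set contains v
    without_v = mis(remove(graph, {v}))
    return max(without_v, with_v, key=len)

def is_independent(graph, vertices):
    """True if no two of the given vertices are adjacent."""
    return all(u not in graph[v] for u in vertices for v in vertices)

def brute_force_size(graph):
    """Largest size of an independent set, by trying every vertex subset."""
    best = 0
    for r in range(len(graph) + 1):
        for subset in itertools.combinations(list(graph), r):
            if is_independent(graph, subset):
                best = r
    return best
```

When $d(v) \ge 2$, the two recursive calls are on graphs with at most $n - 1$ and $n - 3$ vertices, which is where the recurrence $T(n) = T(n - 1) + T(n - 3)$ comes from.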

Robson’s Algorithm The best published purely recursive algorithm for the
independent set problem is due to Robson [22] and runs in time $1.229^n$ for
sufficiently large $n$. A d-self-reduction with $1.229^n$ leaves is evident from Robson’s
paper, so we have a MOL($1.229^n$) algorithm for the independent set
problem. Details will be given in the full version of this paper.

Robson has a faster dynamic programming algorithm for independent set, but
we see no way to adapt it to molecular computing. Molecular computing may
motivate the search for efficient recursive algorithms that do not use dynamic
programming. Towards that end we have found a recursive $1.223^n$ time (for
sufficiently large $n$) algorithm for independent set [5] that is based on a
d-self-reduction and hence is directly adaptable to molecular computing.

Abstract. The construction of evolutionary trees is a fundamental
problem in biology, and yet methods for reconstructing evolutionary trees
are not reliable when it comes to inferring accurate topologies of large
divergent evolutionary trees from realistic length sequences. We address
this problem and present a new polynomial time algorithm for reconstructing
evolutionary trees called the Short Quartets Method which is
consistent and which has greater statistical power than other polynomial
time methods, such as Neighbor-Joining and the 3-approximation
algorithm by Agarwala et al. (and the “Double Pivot” variant of the
Agarwala et al. algorithm by Cohen and Farach) for the $L_\infty$-nearest
tree problem. Our study indicates that our method will produce the correct
topology from shorter sequences than can be guaranteed using these
other methods.


1  Introduction 

Evolutionary trees indicate how species evolved from a common ancestor and are
of fundamental concern to biologists. There are many methods for reconstructing
trees from biomolecular sequences, and all potentially competitive methods
are evaluated according to their accuracy for topology prediction [11]. However,
reconstructing this topology is a difficult task for at least two reasons. First,
all accepted optimization problems in this area are NP-hard, so that methods
which are efficient typically do not provide good performance on large sets of
sequences. More importantly, even if we could solve some of the NP-hard
optimization problems in this domain, the sequence length required in order to
be able to guarantee an accurate topology estimation can be beyond what is
available or even possible. A polynomial time algorithm that can only be
guaranteed to be accurate on unavailable sequence lengths is simply not reliable,
and it must either not be used, or if used its output must not be believed. On
the other hand, a method which is accurate on realistic length sequences can
be used even if it requires more computational resources. We may simply need
to use more machines, wait longer, employ more sophisticated techniques to
implement the same basic objective, etc. Thus, the sequence length needed by a
method imposes a significantly more severe limitation than its computational
requirements. The importance to biologists of this measure of accuracy (called
efficiency or power in the systematic biology literature [14]) is reflected in the
extensive performance analysis literature in systematic biology in which methods
are analyzed according to their performance on model tree reconstruction
under various stochastic models of evolution [12]. Initially these studies focused
on consistency [7], i.e. the question of whether a method would be guaranteed to
produce the correct topology given long enough sequences. Since the discovery
around 1970 [13] of consistent distance transformations (which produce
“corrected distances”), it has been clear that all reasonable distance-based methods
can recover the true tree with high probability given long enough sequences when
applied to corrected distances computed on sequences generated by binary trees.
All this is well-understood in the systematic biology community. What is not so
well-understood is the sequence length needed to obtain an accurate topology
with high probability using a given method on a given model tree. Unfortunately,
sequence lengths are limited, and especially so when the tree to be reconstructed
is large and contains widely divergent sequences.

This  paper  contains  several  results: 

-  We  present  a  probabilistic  analysis  of  the  depth  and  diameter  of  random  trees 
under  two  distributions. 

-  We  describe  a  framework  based  upon  topology-invariant  neighborhoods  which 
permits  the  comparison  of  the  statistical  power  of  different  distance-based 
tree  reconstruction  methods. 

- We develop a new consistent polynomial time method, the Short Quartet
Method for reconstructing evolutionary trees, and provide an analytical study
of its convergence rate for inferring trees under the Cavender-Farris model.
(This analysis extends to a large class of r-state Markov models.) We show
that this method has superior statistical power to Neighbor-Joining, the most
popular distance-based method of phylogenetic tree reconstruction, and to
new results from the theoretical computer science community by Agarwala
et al. (STOC 1996) [1] and Cohen and Farach (SODA 1997 and RECOMB
1997) [5].

Due  to  space  constraints,  we  cannot  give  proofs  in  this  extended  abstract. 

2  Basics 

We begin by describing a simple model of sequence evolution, called the
Cavender-Felsenstein model, or sometimes the Cavender-Farris model. The
Cavender-Felsenstein model of evolution for binary sequences associates to every
edge $e$ in a model tree $T$ a mutation probability $p_e$ with $0 < p_e < 0.5$, and
the mutations on each edge are independent. The sites (i.e. positions within
the sequences) are assumed to evolve identically and independently, with the
state at the root selected according to some distribution (usually uniform). If
$k$ sites evolve under this model, then the tree generates a set of sequences of
length $k$ at the leaves. We allow the input to our method to be any symmetric
zero-diagonal non-negative matrix, and we will abuse the notation and call such
matrices distance matrices.

Definition 1. A distance matrix $D$ is additive if and only if there exists a
tree $T$ with non-negative edge weighting $w$ such that for all leaves $i, j$,
$D_{ij} = \sum_{e \in P_{ij}} w(e)$, where $P_{ij}$ is the path between $i$ and $j$ in $T$. The $L_\infty$
distance between two distance matrices $A$ and $B$ is defined by
$L_\infty(A, B) = \max_{ij} |A_{ij} - B_{ij}|$.

The $L_\infty$-nearest tree problem takes as input a distance matrix $d$ and returns an
additive distance matrix $D$ minimizing $L_\infty(d, D)$. The $\delta$-neighborhood around $d$,
denoted $N(d, \delta)$, is the set of all distance matrices $d'$ such that $L_\infty(d, d') < \delta$.
A distance-based method $M$ for phylogeny construction is a mapping from $n \times n$
distance matrices to $n \times n$ additive distance matrices. A tree $T_1$ is said to refine
a tree $T$ if $T$ can be obtained from $T_1$ by contracting some of the edges in $T_1$.
A method $M$ is said to be combinatorially consistent if $M(D) = D$ for all additive
distance matrices $D$, and continuous at $D$ if for every $\epsilon > 0$ there exists
a $\delta > 0$ such that if $d \in N(D, \delta)$ then $M(d) \in N(M(D), \epsilon)$. We will say that a
distance-based method is reasonable if it is both combinatorially consistent and
continuous at every additive distance matrix defining a binary tree.

An  interesting  characterization  of  additive  matrices  D  is  the  following: 

Theorem 2. Four Point Condition, from [4]: A distance matrix $D$ is an additive
matrix if and only if for all $i, j, k, l$, of the three pairwise sums $D_{ij} + D_{kl}$,
$D_{ik} + D_{jl}$, $D_{il} + D_{jk}$, the largest two are identical.

The proof of the theorem shows that the ordering on the three pairwise sums
indicates the topology induced by the quartet. Thus, if $D_{ij} + D_{kl}$ is strictly
smaller than the other two sums, then the topology induced by the quartet
$i, j, k, l$ is a resolved binary tree; otherwise all three sums are identical, and the
topology induced by $i, j, k, l$ is a star. Since we assume that $T$ is binary, all such
quartets induce resolved subtrees. We will denote this topology by $ij|kl$ when
the pairs that are separated by an internal edge are $ij$ and $kl$.
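The Four Point Condition of Theorem 2 can be checked directly on a small matrix. The following sketch assumes a list-of-lists matrix, a numerical tolerance `tol`, and function names chosen here for illustration. It quantifies over all index tuples, including repeated indices, as in the statement of the theorem.

```python
from itertools import product

def pairwise_sums(D, i, j, k, l):
    """The three pairwise sums of the quartet i, j, k, l, in ascending order."""
    return sorted((D[i][j] + D[k][l], D[i][k] + D[j][l], D[i][l] + D[j][k]))

def is_additive(D, tol=1e-9):
    """True if, for every i, j, k, l, the two largest pairwise sums coincide."""
    n = len(D)
    for i, j, k, l in product(range(n), repeat=4):
        sums = pairwise_sums(D, i, j, k, l)
        if sums[2] - sums[1] > tol:
            return False
    return True

# Leaves 0, 1 and 2, 3 hang off the two ends of an internal edge of weight 2;
# every pendant edge has weight 1.
EXAMPLE = [
    [0, 2, 4, 4],
    [2, 0, 4, 4],
    [4, 4, 0, 2],
    [4, 4, 2, 0],
]
```

For `EXAMPLE` the sums are 4, 8 and 8, so the smallest sum identifies the topology 01|23. The exhaustive loop takes $O(n^4)$ time, which is acceptable for an illustration but is not meant as an efficient test.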

We  now  present  a  characterization  of  additive  distance  matrices  which  define 
the  same  topology. 

Theorem 3. Two additive distance matrices $D$ and $D'$ define the same topology
if and only if for all quartets, the relative orders of the pairwise sums for
that quartet are identical in the two matrices. Therefore, for every reasonable
distance-based method $M$ and for every binary tree $T$ defining additive distance
matrix $D$, there will be a $\delta > 0$ such that $M$ is guaranteed to reconstruct the
topology of $T$ when applied to any $d \in N(D, \delta)$. Consequently, any reasonable
distance-based method $M$ will be consistent on every binary tree when applied
to corrected distances. However, for every edge-weighted tree $T$ with minimum
edge weight $x$, there is a tree $T'$ with a different leaf-labelled topology such that
$L_\infty(D, D') = x/2$, where $D$ is the additive distance matrix for $T$ and $D'$ the
additive distance matrix for $T'$.

We will now describe a method we call the Naive Method, based on Buneman’s
Four-Point Condition. For each quartet $i, j, k, l$ of species, compute the
topology on that quartet by computing the three pairwise sums (this is called
the four-point method (FPM) for reconstructing a tree on a single quartet). If
the three sums are distinct and the minimum is attained at $D_{ij} + D_{kl}$, then set
the topology on $i, j, k, l$ to be $ij|kl$. If the minimum sum is not unique, constrain
the topology to be a star. Construct the tree (if it exists) consistent with all the
constraints on the topologies of quartets. If no tree exists consistent with all the
constraints, output a star tree. (A similar procedure was described by Fitch in
[9].) Constructing a tree consistent with all quartet topologies is easily done in
polynomial time through a variety of techniques, hence this is a polynomial time
method.
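The quartet step of the Naive Method can be sketched as follows. This covers only the four-point method and the collection of quartet constraints; the final tree-building step is not shown. The function names, the list-of-lists matrix, and the rule "resolve whenever the minimum sum is unique, otherwise star" are assumptions made here for illustration.

```python
from itertools import combinations

def fpm_topology(d, i, j, k, l):
    """Four-point method: the split with the unique minimum pairwise sum, or None for a star."""
    sums = [
        (d[i][j] + d[k][l], ((i, j), (k, l))),
        (d[i][k] + d[j][l], ((i, k), (j, l))),
        (d[i][l] + d[j][k], ((i, l), (j, k))),
    ]
    sums.sort(key=lambda item: item[0])
    if sums[0][0] < sums[1][0]:
        return sums[0][1]         # unique minimum: resolved quartet
    return None                   # minimum not unique: constrain to a star

def quartet_topologies(d):
    """Map every quartet of taxa (as a sorted index tuple) to its inferred topology."""
    n = len(d)
    return {q: fpm_topology(d, *q) for q in combinations(range(n), 4)}
```

For a quartet whose true tree has internal edge weight $x$, perturbing every entry by less than $x/2$ changes each pairwise sum by less than $x$, so the smallest sum stays strictly smallest; this is the intuition behind the $x/2$ threshold for the Naive Method.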

We  now  present  a  comparison  of  various  distance  based  methods  based  upon 
topology  invariant  neighborhoods. 

Theorem 4. Let $D$ be an additive $n \times n$ distance matrix defining a binary tree
$T$, $d$ be a fixed distance matrix, and let $\delta = L_\infty(d, D)$. Assume that $x$ is the
minimum weight of internal edges of $T$ in the edge weighting corresponding to $D$.

(i) A hypothetical exact algorithm for the $L_\infty$-nearest tree is guaranteed to return
the topology of $T$ from $d$ if $\delta < x/4$.

(ii) (a) The 3-approximation algorithm for the $L_\infty$-nearest tree is guaranteed to
return the topology of $T$ from $d$ if $\delta < x/8$. (b) For all $n$ there exists at least one
$d$ with $\delta = x/6$ for which the method can err. (c) If $\delta \ge x/4$, the algorithm can
err for every such $d$.

(iii) The Naive Method is guaranteed to return the topology of $T$ from $d$ if $\delta <
x/2$, and there exists a $d$ for any $\delta \ge x/2$ for which the method can err.

In other words, given any matrix $d$ of corrected distances, if an exact
algorithm for the $L_\infty$-nearest tree can be guaranteed to correctly reconstruct the
topology of the model tree, then so can the Naive Method. Thus, an exact
algorithm for the $L_\infty$-nearest tree can err on longer sequences than the Naive
Method, when applied to corrected distances, for any model tree $T$. This
suggests an inherent limitation of the $L_\infty$-nearest tree approach to reconstructing
evolutionary tree topologies.

3  The  Short  Quartet  Method 

The Short Quartet Method is similar in spirit to the Naive Method, in that
it is based upon reconstructing trees for quartets, and then combining these
trees if possible. However, the essential difference is that we attempt to avoid
reconstructing the trees for the difficult quartets. Instead, we attempt to
construct topologies only on those quartets that are close within the tree; these
are called the short quartets. The reconstruction of the tree from these short
quartets involves solving a special case of a problem which is in its general form
NP-complete [15]. The method we use to reconstruct the topology on each quartet
is not specified; if we can afford the time, we may elect to use maximum
likelihood, which has great statistical power but which is computationally too
expensive to use for all but small trees. However, we do not know a priori which
quartets are short quartets. Thus, the method we actually employ is a greedy
method, which surprisingly can be shown to have high probability of accurate
reconstruction of the topology provided that the sequence length is adequate,
even if we reconstruct topologies on quartets using the same (simple and not
particularly statistically powerful) method used by the Naive Method!


3.1  Short  Quartet  Consistency 

We begin by defining the notion of an edi-subtree.

Definition 5. The topological distance between two leaves $i$ and $j$ in a tree $T$
is the number of edges on the path between $i$ and $j$, and the topological length
of a path $P$ is the number of edges on $P$. Consider the subtrees of a binary
tree $T$ obtained by deleting a single edge $e$ in $T$ but not the endpoints of $e$; call
such subtrees edi-subtrees (for edge-deletion-induced). Each such edi-subtree can
be considered a rooted tree, by rooting it at the endpoint of $e$ to which it was
originally attached. Given an edi-subtree $t$, rep($t$) denotes a leaf in $t$ closest to
the root of $t$. Two edi-subtrees which are disjoint and whose roots are distance 2
apart are said to be sibling edi-subtrees. In order to simplify the discussion, we
may abuse the notation and let $t$ also denote the leaf set of the edi-subtree $t$.

We  give  some  more  definitions. 

Definition 6. Let the depth of an edi-subtree in $T$ be the number of edges on
the path from $e$ to the nearest leaf, and let the depth of $T$ (denoted by $d(T)$) be
the maximum depth of any edi-subtree in $T$. We say that a path $P$ in the tree
$T$ is short if its length is at most $2d(T) + 2$. The quartet $i, j, k, l$ is said to be a
short quartet if it induces a subtree which contains a single edge connected to
four disjoint short paths.

Thus, the depth of a complete binary tree of $n$ leaves is $\log_2 n - 1$ but the
depth of a caterpillar (a tree consisting of a long path with leaves hanging off
the path) is just 1. Consequently, every quartet in a complete binary tree on $n$
leaves is a short quartet, but there are only $O(n)$ short quartets in a caterpillar.

We now proceed with the description of the algorithm which we will use to
construct binary model trees from a set of topologies on quartets. Our algorithm
operates by determining siblinghood, first of leaves, and then of larger and larger
rooted edi-subtrees, until the tree is constructed from the leaves inward. The
determination of siblinghood of edi-subtrees is based upon detecting witnesses
and anti-witnesses among the quartets, which we now define.

Definition 7. Given a quartet $\{i, j, k, l\}$ of leaves, we will denote by $ij|kl$ the
induced topology on the quartet in which $i$ and $j$ are separated in $T$ from $k$ and $l$ via
a path. Let $t_1$ and $t_2$ be two edi-subtrees. A witness to the siblinghood of $t_1$ and
$t_2$ is a short quartet $\{u, v, w, x\}$ with topology $uv|wx$ such that $u \in t_1$, $v \in t_2$,
and $\{w, x\} \cap (t_1 \cup t_2) = \emptyset$. We call such quartets witnesses. An anti-witness to the
siblinghood of $t_1$ and $t_2$ is a short quartet $\{p, q, r, s\}$ with topology $pq|rs$, such
that $p \in t_1$, $r \in t_2$, and $\{q, s\} \cap (t_1 \cup t_2) = \emptyset$. We will call these anti-witnesses.

We  now  present  the  property  upon  which  the  algorithm  is  based: 

Axiom 1 Let $t_1$ and $t_2$ be disjoint edi-subtrees of $T$ and assume $T - t_1 - t_2$ has
at least two leaves. Then $t_1$ and $t_2$ are siblings if and only if the following two
conditions hold:

1. There are leaves $y$ and $z$ such that the quartet $\{$rep($t_1$), rep($t_2$), $y$, $z\}$ is a
witness to the siblinghood of $t_1$ and $t_2$, and

2. If there is an anti-witness to the siblinghood of $t_1$ and $t_2$, then there is a
witness for it as well.

This  axiom  provides  the  basis  for  determining  if  there  is  at  least  one  tree 
consistent  with  the  constraints  in  the  set  of  quartets,  but  may  not  be  enough 
to  verify  that  there  are  not  two  such  trees.  Verifying  uniqueness  of  the  solution 
turns  out  to  be  easy,  fortunately,  but  it  is  also  necessary  due  to  the  way  in  which 
we  selectively  apply  the  short  quartet  consistency  algorithm. 

In each edi-subtree, there may be more than one leaf that is closest to the root
of the subtree (in terms of the number of edges on the path from the leaf to the
root). However, among all such closest leaves in each edi-subtree, there is a unique
leaf which has a smallest label, if the species are labelled by $1, 2, \ldots, n$. We call this
leaf the smallest representative of the edi-subtree. This allows us to define
a special set of short quartets, which we call the representative quartets, as
follows. Each short quartet induces a subtree containing a single edge $e = (a, b)$
connected to four paths, so that if we delete both $a$ and $b$ from $T$ we create four
edi-subtrees. We will say that a short quartet is a representative quartet if its
leaves are the smallest representatives of the four edi-subtrees created in this
manner. Then the following can be shown:

Theorem 8. If a binary tree $T$ is consistent with a set $Q$ of quartet topologies
such that $Q$ contains all representative quartets, then $T$ is uniquely consistent with
$Q$.

This observation and the axiom above suggest the following algorithm:

- Start with every leaf of $T$ (i.e. the taxa) defining an edi-subtree.

- While there are more than three edi-subtrees, do:

• Form the graph on vertex set given by the edi-subtrees, and with edge
set defined by siblinghood; i.e., $(x, y)$ is an edge if and only if edi-subtrees
$x$ and $y$ satisfy the conditions of Axiom 1 for siblinghood.

• Make a sibling pair out of each connected component, and make the
roots of the edi-subtrees in that connected component children of
a common root $r$, and replace the pair of edi-subtrees by one
edi-subtree.

• If no new sibling pairs are found, then return fail.

• If there are at most three edi-subtrees left, connect their roots each to
one internal node, and call the resultant tree $T$.

- Verify that $T$ satisfies all the constraints given in the input, and that $Q$
contains the representative quartet for every edge in $T$. If so, return $T$, and
else return fail.

The  correctness  of  this  algorithm  follows  from  the  discussion  above,  and  the 
runtime  of  this  algorithm  depends  upon  how  the  two  edi-subtrees  are  found  that 
can  be  siblings.  It  is  obvious  that  this  can  be  achieved  in  polynomial  time,  but 
the  details  of  the  implementation  are  omitted  due  to  space  constraints. 

Theorem  9.  Given  a  set  Q  containing  all  short  quartets  of  a  tree  T  and  satis¬ 
fying  Axiom  1,  we  can  determine  T  in  0(|Q|logn  T  n^logn)  time. 


3.2  The  entire  method 

We  now  describe  how  we  use  the  short  quartet  consistency  algorithm  to  construct 
the  tree.  One  issue  we  address  is  how  we  select  the  set  of  quartets  to  consider. 
As  it  turns  out,  this  is  done  in  a  greedy  fashion,  which  we  now  describe: 

Definition 10. We define the similarity between sequences $i$ and $j$ to be
$s(i, j) = 1 - 2H(i, j)/k$, where $k$ is the sequence length, and $H(i, j)$ is
the Hamming distance of sequences $i$ and $j$. Let $Q$ be the set of all possible
quartets on $[n]$, and let $Q_w$ be those quartets $a, b, c, d$ such that
$\min\{s(a, b), s(a, c), s(a, d), s(b, c), s(b, d), s(c, d)\} > w$.
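Definition 10 translates almost directly into code. The sketch below assumes equal-length sequences given as strings and uses function names chosen here for illustration; it builds $Q_w$ by enumerating all quartets, which is the straightforward $O(n^4)$ approach rather than an optimized one.

```python
from itertools import combinations

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def similarity(a, b):
    """s(i, j) = 1 - 2 H(i, j) / k for sequences of common length k."""
    return 1 - 2 * hamming(a, b) / len(a)

def quartets_above(seqs, w):
    """Q_w: index quartets whose six pairwise similarities all exceed w."""
    n = len(seqs)
    s = [[similarity(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]
    return [q for q in combinations(range(n), 4)
            if min(s[a][b] for a, b in combinations(q, 2)) > w]
```

Lowering $w$ admits quartets containing more dissimilar pairs, so the sets $Q_w$ grow as $w$ decreases.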

On a given set $Q_w$, the result of applying the Short Quartet Consistency
algorithm will either be a binary tree that is uniquely consistent with all the topology
constraints in $Q_w$, or fail. This permits us to define our method as follows. The
structure of the method is to do a “halving” search among the $w$ by applying the
Short Quartet Consistency algorithm to $Q_w$, starting with $w = 1/2, 1/4$, etc.,
until we either find a tree that the Short Quartet Consistency algorithm reports
as uniquely consistent or realize that no such tree can be found (this evidence
of failure occurs when $w < 1/k$). We can show that with high probability, given
adequate sequence length, this search will examine a set $Q_w$ which contains all
short quartets and which also satisfies Axiom 1. Consequently, in polynomial
time we will reconstruct the tree topology.
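The control flow of the halving search can be sketched as below. The callable `consistency` is a hypothetical stand-in for "build $Q_w$ and run the Short Quartet Consistency algorithm on it, returning a tree or `None` on fail"; it is not implemented here, and the function name and return convention are assumptions.

```python
def halving_search(k, consistency):
    """Try w = 1/2, 1/4, ... until a tree is returned or w falls below 1/k.

    k           -- sequence length
    consistency -- hypothetical callable: w -> tree, or None when the
                   consistency algorithm fails on Q_w
    """
    w = 0.5
    while w >= 1.0 / k:
        tree = consistency(w)
        if tree is not None:
            return w, tree        # first threshold that yields a tree
        w /= 2
    return None                   # w < 1/k: evidence of failure
```

Because $w$ is halved at each step and the search stops once $w < 1/k$, the consistency algorithm is invoked $O(\log k)$ times.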

Theorem  11.  The  Short  Quartets  Method  takes  0(71^^  lognlog/c -h  n^/c)  time  in 
the  worst  case.  On  any  input  d  of  distances  derived  from  sequences  generated  on 
a  model  tree  T,  if  the  Naive  Method  accurately  reconstructs  the  topology  of  T 
from d then SQM will also accurately reconstruct the topology of T from d.
A more realistic analysis of the running time of the Short Quartet Method,
based upon analyzing typical trees, can be obtained by using Theorem 13. Typical
trees under both the uniform and Yule-Harding distributions have $O(\log\log n)$
depths. If the $p_e$ probabilities on the edges of a tree of depth $O(\log\log n)$ are
equal or almost equal, then certain $Q_w$'s with $|Q_w| = O(n\,\mathrm{polylog}\,n)$ will yield a
tree through the consistency algorithm, and the halving search will hit such a $w$
with probability $1 - o(1)$. Consequently, for typical tree shapes and for mutation
probabilities that vary only slightly, applying the Short Quartet Method is likely
to  take  only  0{n^k  +  n?  logn)  time. 

We  now  state  our  main  result: 


Theorem 12. Suppose $k$ sites evolve under the Cavender-Farris model on a
binary tree $T$, so that for all edges $e$, $p_e \in [f, g]$, where we allow $f, g$ to be
functions of $n$. Assume that $g$ is separated from 1/2. The Short Quartet Method
returns the tree $T$ with probability $1 - o(1)$, if


$$k > \frac{c \cdot \log n}{(1 - \cdots)} \qquad (1)$$


where  c  is  a  fixed  constant. 


4  Depth  vs.  Diameter  of  Random  Trees 

We  have  shown  that  the  sequence  length  needed  by  our  method  depends  expo¬ 
nentially  upon  the  minimum  of  the  depth  or  the  diameter  of  the  tree  it  attempts 
to  reconstruct.  We  study  these  topological  quantities  in  this  section. 

Two  simple  models  for  describing  semi-labelled  binary  trees  are  the  uniform 
model,  in  which  each  tree  has  the  same  probability,  and  the  Yule-Harding  model, 
studied  in  [2,  3,  10].  This  distribution  is  based  upon  a  simple  model  of  speciation, 
and  results  in  “bushier”  trees  than  the  uniform  model. 

The  following  results  are  needed  to  analyse  the  performance  of  phylogeny 
reconstruction  algorithms  on  random  binary  trees.  Recall  the  definitions  of  depth 
and  diameter  from  Section  3. 

Theorem 13. a) For a random semilabelled binary tree $T$ with $n$ leaves under
the uniform model, $d(T) < (2 + o(1)) \log_2 \log_2 (2n)$ with probability $1 - o(1)$,
and $\mathrm{diam}(T) > \epsilon\sqrt{n}$ with probability $1 - O(\epsilon^2)$.
b) For a random semilabelled binary tree $T$ with $n$ leaves under the Yule-Harding
distribution, $d(T) = O(\log\log n)$ and $\mathrm{diam}(T) = \Theta(\log n)$, with probability
$1 - o(1)$.


4.1  Analysis  of  the  Short  Quartet  Method 

In [6], Farach and Kannan proposed a method (FK) for reconstructing
Cavender-Farris trees based upon applying the 3-approximation of Agarwala et al.
(discussed in Section 2) for the $L_\infty$-nearest tree problem to corrected distances.
They proved that the method converged quickly for the variational distance (a
related but different concern than the topology estimation), but did not analyze
the convergence to the topology of the model tree. Recently, Kannan extended
the analysis (personal communication) and obtained the following counterpart
to (1): If $T$ is a model tree with mutation probabilities in the range $[f, g]$, and if
sequences of length $k'$ are generated on this tree, where


$$k' > \frac{d \cdot \log n}{\cdots} \qquad (2)$$


and $d$ is some constant, then with high probability the result of applying
Agarwala et al. to Cavender-Farris distances will be a tree with the same topology
as $T$.

We now compare the sequence length requirements of the Short Quartet
Method with those of the 3-approximation algorithm for the nearest $L_\infty$-tree.
Comparing formula (2) to (1), we note that the comparison of depth and
diameter is the most important issue. We always have $\mathrm{diam}(T) > 2\,\mathrm{depth}(T) + 1$.
The constants do not affect the comparison unless the depth and the diameter
are close to each other, which in general they are not (from our earlier results,
for almost all trees, the depth is $O(\log\log n)$ while the diameter is $\Omega(\sqrt{n})$ under
the uniform distribution, while for the Yule-Harding distribution, the depth is
still $O(\log\log n)$ and the diameter is $\Omega(\log n)$). Consequently, the Short Quartet
Method requires much shorter sequence lengths than the Agarwala et al.
algorithm for almost all binary trees.


We  summarize  these  results  in  the  following  table. 


                                     range of mutation probabilities on edges:
                                     f, g are constants   [f, g] = [1/log n, log log n / log n]

binary trees           SQM           polynomial           polylog
(worst-case)           FK            superpolynomial      superpolynomial

random binary trees    SQM           polylog              polylog
(uniform model)        FK            superpolynomial      superpolynomial

random binary trees    SQM           polylog              polylog
(Yule-Harding)         FK            polynomial           polylog

This  comparison  establishes  that  our  method  requires  significantly  shorter 
sequences  in  order  to  ensure  accuracy  of  the  topology  estimation  than  the  algo¬ 
rithm  of  Agarwala  et  al,  for  almost  all  trees  under  both  probability  distributions. 
The  trees  for  which  the  two  methods  need  comparable  length  sequences  are  those 
in  which  the  diameter  and  the  depth  are  as  close  as  possible  -  such  as  complete 
binary  trees.  In  these  cases,  the  previous  analysis  given  in  Section  3  indicates 
that  SQM  will  nevertheless  need  shorter  sequences  than  Agarwala  et  al  will  need 
to  obtain  the  topology  with  high  probability. 

Although  their  running  time  is  likely  to  be  faster  than  ours  on  most  data 
sets,  our  method  is  fast  enough  to  be  useful  for  all  data  sets  that  we  might  wish 
to  analyze  (even  up  to  several  thousand  sequences).  The  real  advantage  of  this 
method  is  its  increase  in  accuracy  on  sequences  of  realistic  length. 
However, both algorithms are fast enough to make real-time computation of
evolutionary  trees  feasible  even  for  very  large  (n  =  500  to  1000)  data  sets.  This 
means  that  the  issue  of  accuracy  realistically  is  the  most  important  issue,  and 
needs  to  be  the  focus  of  the  study. 

5  Lower  bounds 

A  careful  analysis  of  the  table  above  concerning  the  sequence  length  needed  by 
the  short  quartet  method  reveals  that  for  almost  all  trees  under  either  distribu¬ 
tion,  the  required  sequence  length  grows  polylogarithmically  in  the  number  of 
taxa  for  each  fixed  range  of  mutation  probabilities.  In  this  section,  we  show  that 
this  is  a  polynomial  of  the  minimum  possible  sequence  length  for  any  method, 
whether  deterministic  or  randomized. 

We will henceforth assume that all trees we consider are binary trees
bijectively leaf-labelled by the elements of $\{1, 2, \ldots, n\} = [n]$; we will call these
semi-labelled binary trees. Since the number of semi-labelled binary trees on $n$
leaves is $(2n - 5)!!$, encoding deterministically all such trees by binary sequences
at the leaves requires that the sequence length, $k$, satisfy $(2n - 5)!! < 2^{nk}$, i.e.
$k = \Omega(\log n)$. We now show that this information-theoretic argument can be
extended  for  arbitrary  models  of  evolution  and  arbitrary  deterministic  or  even 
randomized  algorithms  for  tree  reconstruction.  For  each  semi-labelled  binary 
tree,  T,  and  for  each  algorithm  A,  whether  deterministic  or  randomized,  we  will 
assume  that  T  is  equipped  with  a  mechanism  for  generating  sequences,  which 
allows  the  algorithm  A  to  reconstruct  the  topology  of  the  underlying  tree  T  from 
the  shortest  possible  sequences  with  constant  probability. 

Theorem 14. Let $T$ be a tree with $n$ leaves labelled by sequences of $\{0, 1\}^k$, and
let $A$ be an arbitrary algorithm, deterministic or randomized. For $A$ to be able
to reconstruct the topology of $T$ from the sequences at the leaves with probability
greater than 1/2 (respectively greater than $\epsilon$), it must hold that $(2n - 5)!! < 2^{nk}$
(respectively, $(2n - 5)!!\,\epsilon < 2^{nk}$), and so $k = \Omega(\log n)$.
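
To make the counting bound concrete, the sketch below computes, for a given number of leaves n, the smallest integer k for which (2n - 5)!! < 2^(nk) holds (the strict inequality follows the statement as printed above). The helper names are assumptions of this example; exact integer arithmetic is used so that no rounding is involved.

```python
def double_factorial_odd(m):
    """m!! = m * (m - 2) * ... * 1 for odd m >= 1."""
    result = 1
    while m > 1:
        result *= m
        m -= 2
    return result

def min_sequence_length(n):
    """Smallest integer k with (2n - 5)!! < 2**(n * k), for n >= 3 leaves."""
    trees = double_factorial_odd(2 * n - 5)
    # trees < 2**m holds exactly when m >= trees.bit_length()
    bits = trees.bit_length()
    return -(-bits // n)   # ceiling of bits / n in integer arithmetic

for n in (10, 100, 1000):
    print(n, min_sequence_length(n))
```

For n = 10 there are 15!! = 2027025 trees, which needs 21 bits, so three sites per leaf are required by this counting argument alone. Because log2((2n - 5)!!) grows like n log n, the resulting k grows like log n.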

The theorem above shows that model and algorithm have to be a very good
match if sequences of length not much more than $\log n$ are to suffice for tree
reconstruction with high probability for every tree. In view of these very mild
conditions, it is amazing that this bound can basically be attained by our SQM,
applied to the Cavender-Farris model!
Abstract. In this paper, we introduce a method for proving universal
termination of constraint logic programs by strictly extending the
approach of Apt and Pedreschi [1]. Taking into account a generic constraint
domain instead of the standard Herbrand universe, acceptable (CLP)
programs are defined. We prove correctness and completeness of the method
w.r.t. the leftmost selection rule for the class of ideal constraint
systems, including CLP($\mathcal{R}_{Lin}$), CLP($\mathcal{RT}$), and CLP($\mathcal{FT}$) among the
others. Moreover, we investigate the problems arising in extending those
results to non-ideal constraint systems, by specifically designing sufficient
conditions for termination of CLP($\mathcal{R}$) programs.


1  Introduction 

Motivations  for  the  termination  analysis  of  logic  programs  are  related  to  sev¬ 
eral  topics,  including  systematic  program  development,  control  generation,  non¬ 
monotonic  reasoning,  decidability  issues,  applications  to  abstract  interpretation, 
program  transformation  and  testing. 

There  are  many  contributions  in  the  literature  on  termination  of  logic  and  Prolog 
programs  (see  [9]  for  a  recent  survey).  However,  research  has  been  mainly  focused 
on  Prolog  programs.  Only  recently  other  logic  programming  (LP)  paradigms 
have  been  considered,  including  logic  programs  with  delay  declarations,  and 
constraint  logic  programming  (CLP). 

Jaffar and Maher claim in their survey [6] that “the CLP Scheme provides a
framework in which the lifting of results from logic programming to CLP is
almost trivial”. As shown in [7], that statement is certainly true for many results,
including the equivalence of declarative, functional and operational semantics.
However,  we  will  show  that  a  well-known  declarative  proof  method  for  termina¬ 
tion  of  logic  programs  can  be  easily  extended  only  to  a  restricted  class  of  systems, 
namely  ideal  constraint  systems.  In  those  systems,  the  consistency  test  is  cor¬ 
rect  and  complete,  in  the  sense  that  a  computation  proceeds  iff  the  accumulated 
constraints  are  satisfiable. 

Although the class of ideal constraint systems includes CLP($\mathcal{R}_{Lin}$), CLP($\mathcal{RT}$),
RISC-CLP(Real) and CLP($\mathcal{FT}$) among the others, several real systems are not
ideal. As the most representative example, in CLP($\mathcal{R}$) [5] non-linear constraints
are delayed until some variables in these constraints get unique values during
the further computation process so that the constraints become linear. If a
computation stops with some delayed non-linear constraints, the system generates
a  “maybe”  answer,  i.e.  the  test  cannot  ensure  consistency  of  all  the  answer  con¬ 
straints  since  the  test  has  been  performed  only  on  the  linear  ones.  The  delaying 
of  passive  constraints  is  a  mechanism  for  bounding  the  computational  complexity 
of  the  constraint  solver.  Unfortunately,  this  prevents  an  early  failure  detection 
and  may  be  the  cause  of  infinite  derivations. 

In  this  paper,  we  introduce  a  method  for  proving  universal  termination  of  con¬ 
straint  logic  programs  with  respect  to  a  leftmost  selection  rule.  We  extend  the 
approach of Apt and Pedreschi [1], who declaratively characterize the class of
logic programs such that every LD-derivation starting with a ground query is
finite, namely acceptable logic programs. On the one hand, we lift their results
to ideal constraint systems, by taking into account a generic constraint domain
instead of the standard Herbrand universe. On the other hand, we improve the
method  by  providing  a  stronger  completeness  theorem  even  in  the  case  of  pure 
logic  programming. 

Concerning non-ideal constraint systems, we study termination of CLP($\mathcal{R}$)
programs by specifically designing two sufficient conditions. Both of them are aimed
at  preventing  the  involvement  of  non-linear  constraints  in  the  termination  anal¬ 
ysis,  either  by  removing  them  from  the  analysis,  or  by  imposing  a  notion  of 
well-modedness  which  ensures  that  non-linear  constraints  become  linear  at  run¬ 
time. 


Preliminaries. We will use throughout the paper the terminology of Jaffar and
Maher [6]. By a program we mean a constraint logic program, i.e. a set of clauses of
the form $A \leftarrow B_1, \ldots, B_n$ where $A$ is an atom and each $B_i$, $i \in [1, n]$, is either an
atom or a constraint. A flat program is a program in which every atom has the form
$p(X_1, \ldots, X_n)$, where $X_1, \ldots, X_n$ are (not necessarily distinct) variables.

A constraint domain $\mathcal{D}$ is a first order structure on the signature $\Sigma$ of the
constraints. We denote with $D$ the domain of $\mathcal{D}$. A $\mathcal{D}$-interpretation of a program $P$ is
an interpretation of $P$ with the same domain as $\mathcal{D}$ and the same interpretation for the
symbols in $\Sigma$ as $\mathcal{D}$. It can be represented as a subset of $B_{\mathcal{D}}^{P}$, where $B_{\mathcal{D}}^{P}$ is the set of
atoms of the form $p(a_1, \ldots, a_n)$, with $a_i \in D$ for $i \in [1, n]$ and $p$ an $n$-ary predicate
symbol appearing in $P$. When $P$ is clear from the context, we write $B_{\mathcal{D}}$. A $\mathcal{D}$-model
of $P$ is a $\mathcal{D}$-interpretation of $P$ which is also a model of it.

We write $\mathcal{D} \models c\vartheta$ when the constraint $c$ is true in $\mathcal{D}$ w.r.t. the valuation $\vartheta$. Given
an atom $p(t_1, \ldots, t_n)$ and a valuation $\vartheta$, $p(t_1, \ldots, t_n)\vartheta$ stands for $p(t_1^{\vartheta}, \ldots, t_n^{\vartheta})$,
where $t_i^{\vartheta}$ is the value of $t_i$ in the valuation $\vartheta$. Analogously for queries and clauses.
A $\mathcal{D}$-ground instance of a clause $C$ is then any $C\vartheta$, where $\vartheta$ is a valuation. For a
$\mathcal{D}$-interpretation $I$ and a $\mathcal{D}$-ground atom $A$, we write $I \models A$ iff $A \in I$. For a $\mathcal{D}$-ground
constraint $c$, we write $I \models c$ iff $\mathcal{D} \models c$.

Those definitions easily extend to a many-sorted language.

The operational semantics of a constraint system is characterized by a transition
relation $\rightarrow$ defined in terms of the relations $\rightarrow_r$, $\rightarrow_c$, $\rightarrow_i$, $\rightarrow_s$ and of the functions
infer and consistent, as described in [6]. infer is required to satisfy
$infer(C, S) = (C', S') \Rightarrow \mathcal{D} \models C \wedge S \leftrightarrow C' \wedge S'$.
consistent is required to satisfy $consistent(C) \Leftarrow \mathcal{D} \models \exists C$.

$N$ is the set of natural numbers. $N^\infty$ is $N \cup \{\infty\}$. The list-length function $ll$ is
defined as follows: $ll(f(t_1, \ldots, t_n))$ is 0 if $f \neq [.|.]$ and $ll(t_2) + 1$ if
$f(t_1, \ldots, t_n) = [t_1 | t_2]$. In particular, the length of an infinite list is $\infty$. $size(t)$ is
the number of symbols occurring in a term $t$. For a pair $(C, S)$, we define the
projection on the first element $(C, S)_1 = C$.
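
The list-length and size functions can be illustrated with a small Python sketch. The term encoding is an assumption of this example: a compound term is a list or tuple whose first element is the functor, a list cell [t1|t2] is the Python list `['.', t1, t2]`, constants are strings, and an infinite (rational) list is modelled by a cell whose tail chain is cyclic.

```python
import math

def ll(term):
    """List-length: 0 for terms that are not list cells, ll(tail) + 1 otherwise.

    A cyclic tail chain stands for an infinite list, whose length is infinity.
    """
    seen = set()
    length = 0
    while isinstance(term, list) and len(term) == 3 and term[0] == '.':
        if id(term) in seen:       # the tail chain loops: infinite list
            return math.inf
        seen.add(id(term))
        length += 1
        term = term[2]
    return length

def size(term):
    """Number of symbols occurring in a finite term."""
    if isinstance(term, (list, tuple)):
        return 1 + sum(size(arg) for arg in term[1:])
    return 1

finite = ['.', 'a', ['.', 'b', '[]']]    # the list [a, b]
cyclic = ['.', 'a', None]
cyclic[2] = cyclic                       # models Xs = [a | Xs]
print(ll(finite), ll(cyclic))
```

Note that `size` is only meant for finite terms; calling it on the cyclic cell would not terminate.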

2  Termination  in  LP 

A  largely  acknowledged  termination  proof  method  for  logic  programs  was  pro¬ 
posed  by  Apt  and  Pedreschi  in  [1],  where  the  class  of  acceptable  logic  programs 
was  introduced.  First  of  all,  we  recall  the  basic  notions  of  level  mappings  and 
ground  instances  of  logic  programs. 

Definition  1.  Given  a  logic  program  P 

- a level mapping for $P$ is a function $|\ | : B_P \rightarrow N$ of ground atoms to natural
numbers. $|A|$ is called the level of $A$.

- $ground(P)$ denotes the set of ground instances of clauses from $P$. □

Intuitively,  a  program  is  acceptable  if  every  time  a  clause  is  used  in  a  LD- 
derivation,  the  level  of  the  head  of  any  of  its  ground  instances  is  greater  than 
the  level  of  each  atom  in  the  body  which  might  be  selected  further. 

Definition 2. Let $P$ be a logic program, and $I \subseteq B_P$ a Herbrand
interpretation.

- $P$ is acceptable by $|\ | : B_P \rightarrow N$ and $I$ iff $I$ is a model of $P$, and for every
$A \leftarrow B_1, \ldots, B_n$ in $ground(P)$: for $i \in [1, n]$

$I \models B_1, \ldots, B_{i-1}$ implies $|A| > |B_i|$

- A query $Q$ is acceptable by $|\ |$ and $I$ iff there exists $k \in N$ such that for every
ground instance $A_1, \ldots, A_n$ of it: for $i \in [1, n]$

$I \models A_1, \ldots, A_{i-1}$ implies $k > |A_i|$ □
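
On a finite set of ground clause instances, the two conditions of Definition 2 for programs can be checked mechanically. The sketch below is only an illustration under stated assumptions: ground atoms are hashable Python values, a clause is a pair (head, body), and the example program (`even`) is hypothetical and restricted to a finite fragment of its ground instances, whereas ground(P) is infinite in general.

```python
def is_acceptable(ground_clauses, level, interp):
    """Check the program part of Definition 2 on finitely many ground clauses.

    ground_clauses: iterable of (head, body), body being a list of ground atoms
    level:          function from ground atoms to natural numbers
    interp:         set of ground atoms (a Herbrand interpretation)
    """
    for head, body in ground_clauses:
        # I must be a model: whenever the whole body is true, so is the head.
        if all(b in interp for b in body) and head not in interp:
            return False
        # The level must decrease to B_i whenever B_1, ..., B_{i-1} hold in I.
        for i, atom in enumerate(body):
            if all(b in interp for b in body[:i]) and not level(head) > level(atom):
                return False
    return True

# Hypothetical example: even(0).  even(N+2) <- even(N).  with N < 10.
clauses = [(("even", 0), [])] + [(("even", n + 2), [("even", n)]) for n in range(10)]
evens = {("even", n) for n in range(0, 12, 2)}
print(is_acceptable(clauses, lambda atom: atom[1], evens))
```

The level mapping used in the example takes the numeric argument of the atom as its level, so each recursive call strictly decreases it.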

We  summarize  the  main  termination  properties  of  acceptable  programs  in  the 
following  Theorem  (see  [1]  for  a  proof). 

Theorem 3. Every LD-derivation for a logic program $P$ and query $Q$ both
acceptable by $|\ |$ and $I$ is finite.

Conversely, if every LD-derivation for $P$ and $Q$ and for $P$ and every ground
query is finite then $P$ and $Q$ are acceptable by some $|\ |$ and $I$. □

Intuitively, a generalization of acceptability to the CLP Scheme has to consider
$\mathcal{D}$-ground instances of clauses, in order to bring the constraint domain into the
proof level. As an example, MEMBER

    member(X, [X | Xs]).

    member(X, [Y | Xs]) ← member(X, Xs).

and the query Xs = [a | Xs], member(b, Xs) show different termination
behaviors when considering finite trees or rational trees as the underlying constraint
domain.

Definition 4. Given a program $P$ defined on a constraint system $\mathcal{D}$,

- a level mapping for $P$ is a function $|\ | : B_{\mathcal{D}} \rightarrow N^\infty$ of $\mathcal{D}$-ground atoms to
natural numbers plus infinitum. $|A|$ is called the level of $A$.

- $ground_{\mathcal{D}}(P)$ denotes the set of $\mathcal{D}$-ground instances of clauses from $P$. □


Though it is clear why we now consider $B_{\mathcal{D}}$, it is less obvious why we include
$\infty$ in the codomain of level mappings. The underlying objective is to be able
to partly reason on termination of programs and a restricted class of queries.
In the case of MEMBER, for instance, it is still legitimate to consider queries of
the form member(x, l) where l is a finite list, since non-termination arises only
for infinite lists. To this end, we extend the $>$ order on natural numbers to the
relation $\succ$, defined as follows:

$n \succ m$ iff $n = \infty$ or $n > m$

Therefore, $\infty \succ a$ for every $a \in N^\infty$, and for $n \in N$, $n \succ m$ iff $m \in N$ and
$n > m$. It is worth noting that although $\succ$ is not an ordering relation, there is
no infinite descending chain $n_1 \succ n_2 \succ \ldots$ when $n_1 \in N$.
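
The extended relation is easy to experiment with. In the sketch below, `math.inf` plays the role of the added element ∞ and the function name `succ` is an assumption of this example; it simply transcribes "n = ∞ or n > m".

```python
import math

INF = math.inf   # stands for the extra element of N ∪ {∞}

def succ(n, m):
    """The extended relation: n = ∞ or n > m."""
    return n == INF or n > m

print(succ(INF, INF))   # ∞ is related to itself, so this is not an ordering
print(succ(3, INF))     # a finite level never dominates ∞
print(succ(3, 2))       # on naturals the relation coincides with >
```

Starting from a finite n, every step of a descending chain must land on a strictly smaller natural number, which is why no infinite descending chain exists in that case.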

3  From  LP  to  ideal  CLP 

Acceptability extends to constraint logic programs by replacing the Herbrand
universe with the constraint domain, and the ordering $>$ with the relation $\succ$.

Definition 5. Let $P$ be a program on the constraint system $\mathcal{D}$, $I \subseteq B_{\mathcal{D}}$ a
$\mathcal{D}$-interpretation and $|\ |$ a level mapping for $P$.

- $P$ is acceptable by $|\ |$ and $I$ iff $I$ is a $\mathcal{D}$-model of $P$, and for every
$A \leftarrow B_1, \ldots, B_n$ in $ground_{\mathcal{D}}(P)$: for $i \in [1, n]$, if $B_i$ is an atom then

$I \models B_1, \ldots, B_{i-1}$ implies $|A| \succ |B_i|$

- A query $Q$ is acceptable by $|\ |$ and $I$ iff there exists $k \in N$ such that for every
$\mathcal{D}$-ground instance $A_1, \ldots, A_n$ of it: for $i \in [1, n]$, if $A_i$ is an atom then

$I \models A_1, \ldots, A_{i-1}$ implies $k \succ |A_i|$
The definition above is quite similar to Definition 2, except for the fact that now
we consider atoms whose level is infinitum, and do not require the decreasing of
the level mapping from the head of a $\mathcal{D}$-ground clause to the constraints in the
body. The latter choice is only a matter of convenience, since the most natural
level of a constraint should be always 0.

Relation $\succ$ plays two roles. On the one hand, it prevents us from reasoning about
badly-typed clauses, i.e. those for which the level of the head is infinitum. In fact,
if the level $|A|$ of the head of a $\mathcal{D}$-ground clause is infinitum, the requirement
$|A| \succ |B_i|$ in Definition 5 is trivially satisfied for every $i$. On the other hand, $\succ$
plays the same role as the $>$ order on naturals when the level of the head is
finite.

We  recall  from  [6]  the  definition  of  ideal  constraint  systems.  We  denote  with 
{C,S)  a  pair  of  sets  of  active  and  passive  (i.e.,  delayed)  constraints. 

Definition 6. A constraint system with operational semantics defined by $\rightarrow$,
consistent and infer is called ideal if

(i) $\rightarrow \; = \; \rightarrow_{ris} \cup \rightarrow_{cis}$

(ii) for every $(C, S)$, $infer(C, S) = (C \cup S, \emptyset)$

(iii) for every $C$, $consistent(C) \Leftrightarrow \mathcal{D} \models \exists C$. □

Therefore, the operational semantics of ideal constraint systems is defined in
terms of $\rightarrow_{ris}$ and $\rightarrow_{cis}$ transitions, the inferred active constraint set $C \cup S$
gathers all the information of the pair $(C, S)$, and the consistency test is
complete. CLP($\mathcal{R}_{Lin}$), CLP($\mathcal{RT}$), CLP($\mathcal{FT}$), RISC-CLP($\mathcal{R}$) fall in this class. On the
contrary, full CLP($\mathcal{R}$) [5] is not ideal, since non-linear constraints are delayed
until they become linear.

As an example, let us consider the CLP($\mathcal{RT}$) (alias Prolog without occur check)
program CURRY, which implements the rules of a simple Curry's type system.
The query type(E, M, T) is intended to calculate the type T of a term M in
the environment E. Since the elements of the domain are rational trees, recursive
polymorphic types are allowed, such as the solution of the equation $\alpha = \alpha \rightarrow \beta$.
The answer constraint for the query

    type([], lambda(x, apply(var(x), var(x))), T).                (1)

binds T to the type $\alpha$.

    type(E, var(X), T) ← in(E, X, T).

    type(E, apply(M, N), T) ← type(E, M, arrow(S, T)), type(E, N, S).

    type(E, lambda(X, M), arrow(S, T)) ← type([(X, S) | E], M, T).

    in([(X, T) | E], X, T).

    in([(Y, T1) | E], X, T) ← X ≠ Y, in(E, X, T).

CURRY and the query (1) are both acceptable by $|\ |$ and $B_{\mathcal{RT}}$, where

    |type(E, M, T)| = ll(E) + size(M)
    |in(E, X, T)| = ll(E)

On the other hand CURRY and a query such as

    M = lambda(x, M), type([], lambda(x, M), T)

are not acceptable by the same level mapping and interpretation. In fact, they
have an infinite LD-tree. In general, CURRY and a query type(E, M, T) may
not  terminate  when  M  is  an  infinite  term.  The  use  of  oo  in  the  codomain  of 
level  mappings  covers  the  situations  in  which  we  are  interested  to  reason  on 
termination  of  a  restricted  class  of  queries.  As  another  example,  consider  the 
well-known  test  &  generate  programming  technique: 

    program(X, Y) ← test(X, Y), generate(X, Y).

test creates a network of constraints between the variables, whilst generate
instantiates the variables. When reasoning on termination, we have to show the
decreasing of the level mapping from the head to the generate atom in the body
only for those $\mathcal{D}$-ground instances that pass the constraint network. Thus, we
should not be worried about the possible divergence arising for generate atoms
that do not satisfy the test constraints.

The  following  theorem  states  termination  of  acceptable  programs  and  queries. 
It  extends  the  first  part  of  Theorem  3  to  ideal  constraint  systems. 

Theorem 7. (Termination Correctness) Consider an ideal constraint system,
and a program $P$ and a query $Q$ both acceptable by $|\ |$ and $I$. Then every
LD-derivation for $P$ and $Q$ is finite. □


Consider again CURRY. By the theorem, we conclude that the LD-tree of the
query (1) is finite.

Focusing  on  termination  completeness,  we  present  a  result  that  extends  the 
second  part  of  Theorem  3.  It  is  even  more  general,  since  we  relax  the  hypothesis 
that  the  LD-tree  of  the  program  and  every  ground  query  is  finite.  In  other  words, 
our  notion  of  acceptability  is  a  correct  and  complete  characterization  of  universal 
termination  with  respect  to  leftmost  selection  rules. 

Theorem 8. (Termination Completeness) Consider an ideal constraint system,
a program $P$ and a query $Q$ such that every LD-derivation for $P$ and $Q$ is finite.
Then there exist $|\ |$ and $I$ such that $P$ and $Q$ are both acceptable by $|\ |$ and $I$. □

4  From ideal CLP to CLP($\mathcal{R}$)


Let us consider now the following program FACT for computing factorial numbers:

    fact(0, 1).
    fact(1, 1).

    fact(N, N * F) ← F >= 1, N >= 2, fact(N-1, F).

A query such as fact(4, F) is intended to compute the 4th factorial number,
i.e. 24. Moreover, the same program can be used to check whether a number is
factorial, by means of a query such as Q = fact(N, 24). We point out that
FACT and Q are both acceptable by $|\ |$ and $B_{\mathcal{R}}$ where

    |fact(n, f)| = int(f)

where int(f) is the integer part of a real f. From Definition 5, the only proof
obligation we have to show is that

    int(n · f) > int(f)


when f ≥ 1, n ≥ 2.
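
The proof obligation follows from a one-line argument: for f ≥ 1 and n ≥ 2 we have n · f ≥ 2f ≥ f + 1, hence int(n · f) ≥ int(f) + 1. As a sanity check only, the following sketch tests the inequality on a grid of rational sample values; the function names and the choice of sample grid are assumptions of this example, and exact `Fraction` arithmetic is used to avoid floating-point rounding.

```python
from fractions import Fraction
from math import floor

def level(f):
    """|fact(n, f)| = int(f), the integer part of f (for f >= 0)."""
    return floor(f)

def obligation_holds(n, f):
    """Check int(n * f) > int(f) for one ground instance."""
    return level(n * f) > level(f)

f_samples = [Fraction(p, 4) for p in range(4, 41)]   # f = 1, 1.25, ..., 10
n_samples = [Fraction(q, 4) for q in range(8, 41)]   # n = 2, 2.25, ..., 10
print(all(obligation_holds(n, f) for n in n_samples for f in f_samples))
```

Outside the stated range the inequality can fail, for instance with n = 1, which is why the constraints F >= 1 and N >= 2 in the third clause matter for the termination proof.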

Running the program and the query Q on a RISC-CLP(Real) system, the
resulting LD-tree is finite. In fact, as RISC-CLP(Real) is ideal, termination is a
consequence of Theorem 7. On the contrary, the LD-tree built by the CLP($\mathcal{R}$) system
is infinite, since the system eventually runs into an infinite loop by applying the
third clause again and again. As CLP($\mathcal{R}$) delays the non-linear constraints, their
unsatisfiability is never checked.

As often happens, real programming language implementations deviate from
theoretically desirable properties. They often sacrifice completeness of the
consistency test for efficiency reasons. The consistency test on passive constraints
is delayed until they are sufficiently instantiated. This is the case, for example,
of non-linear constraints in CLP($\mathcal{R}$). As a consequence, the computation may
proceed even in the case that the accumulated constraints are unsatisfiable.

A  simple  extension  of  our  approach  to  generic  systems  is  then  to  prevent  the  use 
of  any  declarative  reading  of  programs  in  the  termination  proofs. 

Definition 9. A program $P$ is recurrent by $|\ |$ iff for every $A \leftarrow B_1, \ldots, B_n$ in
$ground_{\mathcal{D}}(P)$: for $i \in [1, n]$, if $B_i$ is an atom then $|A| \succ |B_i|$. □

The definition of recurrent queries is derived accordingly. It can be easily shown
that any derivation is finite with respect to any selection rule, when considering
programs and queries both recurrent by the same level mapping. Recurrent
programs extend the recurrent logic programs introduced by Bezem [2]. As an example,
consider the program MAP, defined in CLP($\mathcal{R}$).

    map([], []).

    map([X|Xs], [Y|Ys]) ← Y = X * X, map(Xs, Ys).

It is easy to see that it is recurrent by defining |map(Ls, Rs)| = ll(Ls). However,
if we rewrite MAP in a flat form, namely the following MAPFLAT

    map(A, B) ← A = [], B = [].

    map(A, B) ← A = [X|Xs], B = [Y|Ys], Y = X * X, map(Xs, Ys).

we obtain a program that is not recurrent.

In the rest of this section, we give some sufficient conditions specially designed
for termination of CLP($\mathcal{R}$) programs together with a generalization of the
underlying insights to other non-ideal constraint systems.

A first idea is to exclude non-linear constraints from the termination analysis.
The next theorem states that if a program and a query with their non-linear
constraints removed terminate, then the original program and query terminate as well.

Theorem 10. Consider the CLP($\mathcal{R}$) system. The LD-tree for a program $P$ and
a query $Q$ is finite if $P'$ and $Q'$ are both acceptable by $|\ |$ and $I$, where $P'$ (resp.,
$Q'$) is obtained by deleting all non-linear constraints from $P$ (resp., $Q$). □

Intuitively,  the  conclusion  follows  since  adding  constraints  to  a  clause  implies 
having  shorter  derivations. 

Consider again the MAPFLAT program. It is immediate to observe that the
non-linear constraint Y = X * X does not play a relevant role in termination of a
query such as map([X,3,5], Z). In fact, termination is given by the decreasing
of the length of the list in the first argument of map. By deleting Y = X * X we
get the program MAPFLAT'

    map(A, B) ← A = [], B = [].

    map(A, B) ← A = [X|Xs], B = [Y|Ys], map(Xs, Ys).

which is acceptable by $|\ |$ and $B_{\mathcal{R}}$, where |map(Ls, Rs)| = ll(Ls). Therefore,
we conclude that the LD-tree for MAPFLAT and map([X,3,5], Z) is finite.

In  general,  we  have  a  stronger  result  for  a  large  class  of  constraint  systems. 

Definition 11. A constraint system with operational semantics defined by $\rightarrow$,
consistent and infer is called incremental if $\rightarrow \; = \; \rightarrow_{ris} \cup \rightarrow_{cis}$, and

[M] for every $S' \subseteq S$, $consistent(infer(\emptyset, S)_1) \Rightarrow consistent(infer(\emptyset, S')_1)$

[I] for $S$, $S'$ sets of constraints, and $C$ set of active constraints

$consistent(infer(infer(C, S) \cup (\emptyset, S'))_1) \Leftrightarrow consistent(infer(C, S \cup S')_1)$ □


As an example, ideal constraint systems and CLP($\mathcal{R}$) are incremental.
Basically, [M] requires monotonicity of consistent and infer, a condition naturally
satisfied in all practical systems.

[I]  is  an  incrementality  requirement.  Starting  from  a  pair  (C,S),  if  applying 
infer  first,  then  adding  the  constraints  in  S'  and  then  re-applying  infer  we 
obtain  a  consistent  state,  then  the  state  obtained  by  applying  infer  only  once 
to  (C,  S  U  S')  should  be  consistent  as  well,  and  vice-versa. 
Theorem 12. Consider an incremental constraint system. The LD-tree for a
program P and a query Q is finite if the LD-tree for P' and Q' is finite, where
P' (resp., Q') is obtained by deleting some constraints from P (resp., Q). □

However,  this  approach  is  not  sufficient  to  prove  termination  when  it  depends  on 
non-linear  constraints.  Consider  the  program  SQRT  for  computing  square  roots 
of  naturals. 

sqrt(X, R) ← A = 0, sqrt2(X, A, R).
sqrt2(X, A, A) ← (A+1)*(A+1) > X.

sqrt2(X, A, B) ← (A+1)*(A+1) ≤ X, A1 = A + 1, sqrt2(X, A1, B).

If  we  remove  the  non-linear  constraints,  we  get  a  program  that  has  an  infinite 
LD-derivation  for  any  query  by  applying  the  third  rule  again  and  again.  In 
addition,  the  non-linear  constraints  become  linear  at  run-time  iff  sqrt2  is  called 
with  the  second  argument  ground. 

To reason properly about programs containing non-linear constraints that become
linear at run-time, we introduce a notion of moding. Without any loss of
generality, we restrict our attention to flat programs.

Definition 13.

- Consider an n-ary predicate symbol p. A mode for p is a function d_p from
{1, ..., n} to {+, -, ↑}. If d_p(i) = '+' we call i an input position. If
d_p(i) = '-' then i is called an output position. If d_p(i) = '↑' then i is
called a blank position (with respect to d_p). We write d_p in the form
p(d_p(1), ..., d_p(n)).

- A mode for a constraint c(X_1, ..., X_n) whose variables are X_1, ..., X_n
is a function d_p from {X_1, ..., X_n} to {+, -, ↑}. We write d_p in the form
c(X_1^{d_p(1)}, ..., X_n^{d_p(n)}).

- For an atom or a constraint A, we write A(X, Y, Z) to denote that X are
the variables occurring in input positions, Y are those occurring in output
positions, and Z are those occurring in blank positions.

- We say that a flat program P is well-moded iff for every clause

A_0(Y_0, X_{n+1}, Z_0) ← A_1(X_1, Y_1, Z_1), ..., A_n(X_n, Y_n, Z_n)

of P, for i ∈ [1, n+1], X_i ⊆ ∪_{k<i} Y_k.

- We say that a flat query A_1(X_1, Y_1, Z_1), ..., A_n(X_n, Y_n, Z_n) is well-moded
iff for i ∈ [1, n], X_i ⊆ ∪_{k<i} Y_k. □

The  intuition  underlying  this  definition  is  to  force  the  input  variables  in  an  atom 
or  a  constraint  selected  along  a  LD-derivation  to  be  grounded  by  the  active 
constraints.  Variables  not  involved  in  the  input-output  relation  are  marked  as 
blank. 
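Since well-modedness is a purely syntactic condition, it can be checked mechanically. The following Python sketch is only an illustration under stated assumptions: the function `well_moded_clause` and the encoding of each atom or constraint as a pair (input variables, output variables) are hypothetical, and the moding used for the SQRT clauses (second argument of sqrt2 as input, `A = 0` producing A, `A1 = A + 1` producing A1) is one reading of the moding discussed in the text.

```python
def well_moded_clause(head_in, head_out, body):
    """Check X_i ⊆ ∪_{k<i} Y_k for i in [1, n+1].

    head_in  : variables in input positions of the head (the tuple Y_0)
    head_out : variables in output positions of the head (the tuple X_{n+1})
    body     : list of (inputs, outputs) pairs, one per body atom/constraint
    Blank positions impose no condition, so they are simply omitted.
    """
    available = set(head_in)                 # Y_0
    for inputs, outputs in body:
        if not set(inputs) <= available:     # X_i must already be produced
            return False
        available |= set(outputs)            # Y_i becomes available
    return set(head_out) <= available        # X_{n+1}


# sqrt(X, R) <- A = 0, sqrt2(X, A, R)   with sqrt2's second argument as input
clause1 = well_moded_clause([], [], [([], ["A"]), (["A"], [])])

# sqrt2(X, A, B) <- (A+1)*(A+1) <= X, A1 = A + 1, sqrt2(X, A1, B)
clause3 = well_moded_clause(
    ["A"], [], [(["A"], []), (["A"], ["A1"]), (["A1"], [])]
)

# Swapping the last two body elements uses A1 before it is produced.
swapped = well_moded_clause(
    ["A"], [], [(["A"], []), (["A1"], []), (["A"], ["A1"])]
)
```

The left-to-right scan mirrors the LD selection rule: an input variable is acceptable only if some earlier element of the clause has it in an output position.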

Suppose now that the moding of the constraints is consistent with the operational
semantics, i.e. if a constraint c(X, Y, Z) is selected and the active constraints
imply X = a for some tuple a of elements of the domain, then the active
constraints of the resolvent (if it exists) imply Y = b for some tuple b. Under this
assumption, when a non-linear constraint is selected then the input variables
are grounded by the active constraints. We can exploit this fact to impose that
non-linear constraints become linear at run-time.

Definition 14. A moding for a program P (resp., a query Q) is consistent w.r.t.
CLP(R) if for every constraint c(X, Y, Z) in P (resp., Q) either

(i) Y is an empty tuple and c(X, Y, Z) is linear in Z, or

(ii) Y is a tuple of only one variable, Z is an empty tuple and c(X, Y, Z) is an
equation linear in Y. □

It is worth noting that both well-modedness and consistency w.r.t. CLP(R) are
syntactic notions. Consider again the program SQRT. It is immediate to see that
it is well-moded with the moding

sqrt(↑, ↑)    sqrt2(↑, +, ↑)    A^- = 0

(A^+ + 1)*(A^+ + 1) > X^↑    (A^+ + 1)*(A^+ + 1) ≤ X^↑    A1^- = A^+ + 1.

Moreover, the moding for the constraints is consistent w.r.t. CLP(R). The next
theorem relates modings, acceptability and termination by providing a sufficient
condition for termination of well-moded acceptable CLP(R) programs.

Theorem 15. Consider the CLP(R) system. Let P and Q be a well-moded flat
program and query, and let the moding be consistent w.r.t. CLP(R). Suppose P
and Q are both acceptable by I and | |. Then every LD-derivation for P and Q
is finite. □


The program SQRT and the query sqrt(n, R) for n ∈ N are acceptable by Bn
and | |, where

|sqrt2(x, a, b)| = max(x - a, 0) if x, a ∈ N, and ∞ otherwise

|sqrt(x, r)| = x + 1 if x ∈ N, and ∞ otherwise

Therefore, Theorem 15 allows us to state that the LD-tree for SQRT and sqrt(n,
R) is finite when n ∈ N.
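The role of this level mapping can be checked on concrete inputs with a small Python sketch. It is a hedged illustration rather than the CLP(R) execution model: the function name `sqrt_levels` is hypothetical, and the recursive clause of sqrt2 is assumed to apply exactly when the test of the non-recursive clause fails.

```python
def sqrt_levels(x):
    """Follow the derivation of sqrt(x, R) for a natural number x.

    Returns the computed root together with the levels of the selected
    atoms: |sqrt(x, r)| = x + 1, then |sqrt2(x, a, b)| = max(x - a, 0).
    """
    levels = [x + 1]                      # level of the initial sqrt atom
    a = 0                                 # sqrt(X, R) <- A = 0, sqrt2(X, A, R)
    while True:
        levels.append(max(x - a, 0))      # level of the selected sqrt2 atom
        if (a + 1) * (a + 1) > x:         # non-recursive clause: answer is a
            return a, levels
        a += 1                            # recursive clause: A1 = A + 1


root, levels = sqrt_levels(10)
decreasing = all(p > q for p, q in zip(levels, levels[1:]))
```

Whenever the recursive clause is used we have (a + 1)^2 ≤ x, hence a + 1 ≤ x and the level x - a is at least 1 before dropping by one; this is why the non-linear test, and not the linear constraints alone, bounds the derivation.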


Theorem 15 can be used together with Theorem 12 in order to prove termination
of programs P and queries Q defined on CLP(R), by means of the following
strategy:

(i) delete some (non-linear) constraints from P and Q, and
(ii) show that the resulting program and query are well-moded and acceptable by
the same model and level mapping.


Finally, we point out that this approach can be extended to a generic non-ideal
constraint system by appropriately defining a notion of consistency of constraint
moding w.r.t. the system.




5  Conclusions 

There is still little work on the extension of termination approaches to constraint
logic programming. The only papers we are aware of are [3] and [8]. [8] provides
sufficient conditions based on approximation techniques, with the aim of
automating the termination proof. [3] presents a necessary and sufficient condition for
termination based on a radically different approach from ours, which is inspired
by the works of Floyd on termination of flowchart programs. We also cite [4],
where a class of programs is characterized with no delayed constraints at the
end of successful computations. That method is also able to discover possible
sources of non-termination due to delaying of non-linear constraints.

We presented an extension to the CLP Scheme of a widely acknowledged
approach to termination of logic programs. For a large class of constraint systems,
namely ideal constraint systems, we extend and improve on the results of [1],
showing stronger forms of correctness and completeness even in the case of pure
logic programming. In the second part of the paper, we investigated termination
specifically for the CLP(R) system, by proposing two sufficient conditions.
Abstract. This paper investigates the expressive power of DATALOG¬
queries under unique T-stable model semantics, i.e., a query on a given
database yields an answer if and only if there exists a unique T-stable
model. Under this semantics DATALOG¬ queries are shown to express
exactly all decision problems with unique solutions. Obviously, unique
T-stable model semantics is the 'natural' semantics for queries with at
most one T-stable model or with exactly one T-stable model for every
database. The expressive powers of these two classes of queries are
investigated as well, but it turns out that any practical language for such
queries cannot reach an expressive power higher than DATALOG with
stratified negation.


1  Introduction 

Total stable models (T-stable models) [9] provide a simple, yet powerful
semantics for DATALOG¬, i.e., logic programming with negation but without function
symbols. One of the properties of stable models is their multiplicity: a program
may have from 0 to n T-stable models, where n can grow exponentially with
the size of the universe.

Multiplicity has been recognized by some authors as an important
opportunity for either expressing non-determinism or for increasing the expressive power
while preserving determinism (e.g., by taking the union or the intersection of all
models). On the other hand, multiplicity has been strongly criticized by many
other authors, mainly because the canonical meaning of a logic program is
traditionally based on a unique model. This criticism explains the great deal of
interest in special classes of DATALOG¬ programs with 'unique' T-stable models, such
as stratified model [3] or total well-founded model [23], notwithstanding their
reduced expressive power (indeed, only a proper subset of polynomial problems
are expressible by such programs).

An interesting question is the following: is there any class of DATALOG¬ queries
that preserves the T-stable model uniqueness property but has an expressive
power higher than stratified DATALOG¬? To answer this question, we investigate
the classes Q_{0,1} and Q_1 of DATALOG¬ queries admitting, respectively, at most
one T-stable model and exactly one T-stable model for every input database.
We show that the expressive powers of these two classes are bounded by NP ∩ UW
and NP ∩ coNP respectively. But NP ∩ UW and NP ∩ coNP, as well as all
their meaningful subclasses from P upward, are not known to be expressible by a
(recursively enumerable) query language [13]. Moreover, although total
well-founded semantics is capable of expressing all fixpoint queries, as recently shown in
[8], no language is known which has the same power as fixpoint queries and only
generates queries having total well-founded models for every database. Thus, it
appears that any practical language for queries with a unique T-stable model is
not more expressive than stratified DATALOG¬!

To get more expressive power using a semantics based on a unique total stable
model, it probably remains to take the whole class of DATALOG¬ queries and to
check uniqueness a posteriori. To this end, we introduce the unique T-stable
model semantics: a ground literal is true if it is in a T-stable model and
there exists no other T-stable model; informally, multiplicity corresponds to a
negative answer. We show that the class of all DATALOG¬ queries under unique
T-stable model semantics is able to express all the decision problems that can
be defined using an existential second-order formula of the form (∃!S)Φ(S) with
unique witnesses for the second-order quantifiers, i.e., there are unique relations
S_1, ..., S_m in S satisfying the first-order formula Φ(S); we call this class UW.
This is an interesting class which consists of most of the decision problems with
unique solutions. Observe that T-stable models under a popular version of
T-stable model semantics, certain semantics, capture coNP; so, as coNP ⊆ UW,
unique T-stable model semantics turns out to be more expressive than certain
semantics.

The paper is organized as follows. Background and basic definitions on
T-stable model semantics for DATALOG¬ queries are given in Section 2. The
expressive power of unique T-stable model semantics for the class of all DATALOG¬
queries is investigated in Section 3. The analysis of the subclasses Q_{0,1} and Q_1
as well as the conclusion are presented in Section 4.


2 Total Stable Models and DATALOG¬ Queries

Let us start by recalling basic concepts and notation of the DATALOG¬
language, that is, logic programming with negative goals in the rules but without
function symbols [1, 21].

A rule r is a formula of the language of the form Q ← Q_1, ..., Q_m, where Q
is an atom (head of the rule) and Q_1, ..., Q_m are literals (goals of the rule). A
ground rule with no goals is called a fact; a rule without negative goals is called
positive. A DATALOG¬ program is a finite set of function-free rules and it is called
positive (or, simply, DATALOG) when all its rules are positive.

Given a DATALOG¬ program LP, some of the predicate symbols (EDB
predicates) do not occur in the rule heads, as they are defined by a number of facts
stored in a database; the other predicate symbols are called IDB predicates.
EDB predicate symbols form a relational database scheme DS_LP, thus they are
also seen as relation symbols. A database D on DS_LP is a set of finite relations
D(r) on a countable domain U, one for each r in DS_LP. Given a database D on
DS_LP, LP_D denotes the program obtained from LP by adding the facts
corresponding to the relation tuples in D. Observe that the Herbrand universe and
the Herbrand base for LP_D (denoted by U_{LP_D} and B_{LP_D}, respectively) are both
finite; moreover, U_{LP_D} is a finite subset of U, as possible constants in LP are also
taken from the domain U. Any subset of B_{LP_D} is called an interpretation.

Let M be an interpretation of the program LP_D. Let pos(LP_D, M) be the
positive program obtained from the ground instantiation of LP_D by deleting (a)
each rule that has a negative goal ¬A for which A ∈ M, and (b) all negative
goals from the remaining rules. Then M is a total stable (T-stable) model [9] if and
only if T^∞_{pos(LP_D, M)}(∅) = M, where the operator T is the classical immediate
consequence transformation. The existence of a T-stable model for any program
is not guaranteed.
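The stability test just described is easy to state operationally for a ground program. The Python sketch below is an illustration under stated assumptions, not the paper's formalism: ground rules are encoded as hypothetical triples (head, positive goals, negated atoms), and the function names `least_model` and `is_t_stable` are introduced only for this example.

```python
def least_model(positive_rules):
    """Least fixpoint of the immediate consequence operator T for a ground
    positive program given as (head, positive_goals) pairs."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in positive_rules:
            if head not in model and all(atom in model for atom in pos):
                model.add(head)
                changed = True
    return model


def is_t_stable(rules, m):
    """Check whether the set of atoms m is a T-stable model of the ground
    program `rules`, a list of (head, positive_goals, negated_atoms)."""
    m = set(m)
    # (a) drop rules with a negative goal not A such that A is in m;
    # (b) drop the negative goals from the remaining rules.
    reduct = [(h, pos) for h, pos, neg in rules
              if not any(atom in m for atom in neg)]
    return least_model(reduct) == m


# a <- not b.   b <- not a.   (two T-stable models: {a} and {b})
choice = [("a", [], ["b"]), ("b", [], ["a"])]
# p <- not p.   (no T-stable model)
odd_loop = [("p", [], ["p"])]
```

Both steps run in time polynomial in the size of the ground program, which is the observation behind the polynomial bound on checking a given interpretation.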

Fact 1 [9, 17] Given a DATALOG¬ program LP, a database D on DS_LP, and an
interpretation M for LP_D, then

1. deciding whether M is a T-stable model for LP_D is in P;

2. deciding whether there exists a T-stable model for LP_D is NP-complete. □

Three versions of deterministic semantics for T-stable models are known in
the literature: the possible (or credulous or brave) semantics [2, 20, 6], the certain
(or skeptical or cautious) semantics [9, 2, 20, 6], and the definite semantics [19].
We now introduce a fourth version: the unique T-stable model semantics.

Definition 1. Given a DATALOG¬ program LP, a database D on DS_LP and a
ground literal A, then

1. A is a TS^∃ (possible) inference of LP_D if A is true in some T-stable model
of LP_D;

2. A is a TS^∀ (certain) inference of LP_D if A is true in each of the T-stable
models of LP_D;

3. A is a TS^{∀!} (definite) inference of LP_D if LP_D has at least one T-stable model
and A is in each of these models;

4. A is a TS^1 (unique) inference of LP_D if LP_D has exactly one T-stable model
and A is true in this model. □

The above versions of T-stable model semantics will be denoted by TS^ν, where
ν is ∃, ∀, ∀!, or 1.
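The four notions of inference differ only in how they quantify over the set of T-stable models. The following Python sketch makes this explicit; it assumes the T-stable models have already been computed and are passed in as a list of sets of atoms, and the names `holds` and `infer` as well as the encoding of a literal as a pair (atom, sign) are hypothetical.

```python
def holds(literal, model):
    """A literal is (atom, positive); in a total model a negative literal
    is true exactly when its atom is absent."""
    atom, positive = literal
    return (atom in model) == positive


def infer(literal, models, version):
    """Decide whether `literal` is an inference under the given version:
    'possible', 'certain', 'definite' or 'unique'."""
    if version == "possible":
        return any(holds(literal, m) for m in models)
    if version == "certain":
        return all(holds(literal, m) for m in models)
    if version == "definite":
        return len(models) >= 1 and all(holds(literal, m) for m in models)
    if version == "unique":
        return len(models) == 1 and holds(literal, models[0])
    raise ValueError(f"unknown version: {version}")


two_models = [{"a"}, {"b"}]
no_models = []
one_model = [{"a"}]
```

Note that with no T-stable model every literal is a certain inference (the universal condition holds vacuously), whereas the definite and unique versions reject it.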

Definition 2. A (bound DATALOG¬) query Q is a pair (LP, G), where LP is a
DATALOG¬ program and G is a ground literal (the query goal); possible
constants in G are in U as well. The set of all queries is denoted by Q.

Given any T-stable model semantics TS^ν, the database set of Q under TS^ν,
denoted by EXP_{TS^ν}(Q), is the set of all databases D on DS_LP for which G is a
TS^ν inference of LP_D. Moreover, the expressive power of the TS^ν semantics is
measured by the family of the database sets of all possible queries and is denoted
by EXP_{TS^ν}[Q] = {EXP_{TS^ν}(Q) | Q ∈ Q}. □
It is well known that for each query Q and for each T-stable model semantics
TS^ν, EXP_{TS^ν}(Q) is indeed a generic database set [5, 1], i.e., it is closed under
renaming of constants in (U − C), where C is the set of constants occurring in LP
and in G; thus the constants not in C are not interpreted and relationships
among them are only those explicitly provided by the databases. From now
on any generic set of databases on the same scheme will be called a database
collection.

The expressive power of any T-stable model semantics will be measured w.r.t.
classes of database collections defined as follows. Given a (not necessarily Turing
machine) complexity class C of decision problems and a database collection 𝐃, 𝐃
is C-recognizable if the problem of deciding whether a database D is in 𝐃 is in C.
The database complexity class DB-C is the family of all C-recognizable database
collections; for instance, DB-P is the family of all database collections that are
recognizable in polynomial time. Observe that any two database collections in a
database complexity class do not in general share the same database scheme.

We stress that our expressive power measure follows the data complexity
approach of [5, 24], for which the query is assumed to be a constant whereas the
database is the input variable. The following results are known in the literature:

Fact 2 Given a DATALOG¬ program LP, a database D on DS_LP, and an
interpretation M for LP_D, then

1. EXP_{TS^∃}[Q] = DB-NP [17];

2. EXP_{TS^∀}[Q] = DB-coNP [20];

3. EXP_{TS^{∀!}}[Q] = DB-D^P [19]. □

Example 1. Let DS_K = {v, e} be a database scheme defining directed graphs, and
consider the set of all databases on DS_K corresponding to graphs with a kernel;
recall that a kernel of a graph G = (V, E) is a subset V_1 of V such that (a) for any two
x, y ∈ V_1, the edge (x, y) is not in E, and (b) for any y ∈ V_2 = V − V_1 there is
an x ∈ V_1 such that (x, y) ∈ E. Consider the following DATALOG¬ program K:

v1(X) ← v(X), ¬v2(X).

v2(X) ← v(X), ¬v1(X).

joined_to_V1(X) ← v1(Y), e(Y,X).

no_condition_a ← v1(X), joined_to_V1(X).

no_condition_b ← v2(X), ¬joined_to_V1(X).

kernel ← ¬no_condition_a, ¬no_condition_b.

T_constraint ← ¬kernel, ¬T_constraint.

Given any database D on DS_K, say corresponding to the graph G, any possible
T-stable model M of K_D must make T_constraint false because of the last rule
(otherwise, T_constraint would be undefined); then M must make kernel true,
i.e., the vertices selected for V_1 by M through the first rule form a kernel for
G. Hence, K_D has exactly one T-stable model for each kernel of the graph.
Given the query Q^K = (K, kernel), EXP_{TS^∃}(Q^K) = EXP_{TS^{∀!}}(Q^K) is the set of
all graphs with a kernel; i.e., Q^K under both possible and definite T-stable model
semantics defines the NP-complete problem of whether a graph has a kernel.
Moreover, EXP_{TS^1}(Q^K) is the set of all graphs with exactly one kernel; i.e.,
Q^K under unique T-stable model semantics defines the problem of whether a
graph has exactly one kernel. On the other hand, as kernel is a TS^∀ inference
also when there is no T-stable model, the database set of Q^K under TS^∀
semantics consists of all graphs, i.e., the query is meaningless under this semantics.

Let K' be obtained from K by removing the last rule. Now, there are T-stable
models for K'_D also when D corresponds to a graph without kernel. Consider now
the query Q^{K'} = (K', ¬kernel). We have that EXP_{TS^∀}(Q^{K'}) = EXP_{TS^{∀!}}(Q^{K'})
is the set of all graphs without kernel; i.e., Q^{K'} under both certain and
definite T-stable model semantics defines the coNP-complete problem of
whether a graph has no kernel. This query is meaningless under possible and
unique T-stable model semantics. □
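Since the T-stable models of K_D are in one-to-one correspondence with the kernels of the graph, the answers of the kernel queries can be cross-checked on small graphs by enumerating kernels directly. The Python sketch below does this by brute force; it does not evaluate the DATALOG¬ program itself, and the function name `kernels` and the encoding of a graph as a vertex list plus a set of directed edges are assumptions made for illustration.

```python
from itertools import combinations


def kernels(vertices, edges):
    """Enumerate all kernels of a directed graph by brute force.

    A kernel V1 satisfies (a) no edge (x, y) with x, y in V1, and
    (b) every vertex outside V1 has an incoming edge from some x in V1.
    """
    vs = list(vertices)
    es = set(edges)
    found = []
    for size in range(len(vs) + 1):
        for candidate in combinations(vs, size):
            v1 = set(candidate)
            independent = not any((x, y) in es for x in v1 for y in v1)
            dominating = all(
                any((x, y) in es for x in v1) for y in vs if y not in v1
            )
            if independent and dominating:
                found.append(frozenset(v1))
    return found


path = kernels(["a", "b"], [("a", "b")])                    # one kernel
two_cycle = kernels(["a", "b"], [("a", "b"), ("b", "a")])   # two kernels
three_cycle = kernels(
    ["a", "b", "c"], [("a", "b"), ("b", "c"), ("c", "a")]
)                                                           # no kernel
```

In terms of the example, the goal kernel is a possible and a definite inference when the list is non-empty, and a unique inference when it has exactly one element; the directed 3-cycle is a graph for which only the certain semantics (vacuously) accepts.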

From Fact 2 it follows that, as far as the expressive powers are concerned,
definite semantics subsumes the other two semantics which, in turn, are
incomparable with each other (unless NP = coNP). In the next section we characterize
the expressive power of unique T-stable model semantics.


3  Expressive  Power  of  Unique  Stable  Model  Semantics 

In this section we prove that unique T-stable model semantics captures the whole
class DB-UW, consisting of all database collections 𝐃 that can be defined using
an existential second-order formula of the form (∃!T)Φ(T) with unique witnesses
for the second-order quantifiers, i.e., there are unique relations in T satisfying
the first-order formula Φ(T) on a finite structure D. Obviously every problem
in UW is also in US (the class of problems with unique solution [4]); however,
not every problem in US can be written in the above logic form [15]. The class
UW includes coNP whereas it is not known whether it also includes NP; the
latter question is equivalent to the question of whether D^P equals UW (and
US as well).

The formula (∃!T)Φ(T) is in Skolem normal form if the first-order formula
Φ(T) is in the following format:

Φ(T) = (∀x)(∃y)(Θ_1(T, x, y) ∨ ... ∨ Θ_k(T, x, y)).

Next we show that any existential second-order formula with unique witnesses
can be brought into Skolem normal form, as happens for formulas with multiple
witnesses.

Lemma 3. Given a second-order formula F = (∃!T)Φ(T), there is a Skolem
normal form formula which is equivalent to F.

Proof. We first bring Φ(T) in prenex normal form and then apply repeatedly
the equivalence

(∀u)(∃v)Θ(u, v) ≡ (∃!S){(∀u)(∀v)[S(u, v) ↔ Θ(u, v)] ∧ (∀u)(∃v)S(u, v)}

Observe that our "Skolemization" differs from the classical one for existential
second-order formulas with multiple witnesses [7, 16] essentially in that
S(u, v) → Θ(u, v) is replaced by S(u, v) ↔ Θ(u, v). Thus, we require that the
chosen relation for S be maximal, i.e., it contains exactly all the tuples (u, v)
satisfying Θ, in addition to having at least one such tuple for every u as in the
classical Skolemization. Therefore, as the maximal relation is
unique, also ∃S of classical Skolemization can be replaced by ∃!S. Note that our
procedure of Skolemization in general requires more steps than the classical one,
because the implication S(u, v) ← Θ(u, v) corresponds to S(u, v) ∨ ¬Θ(u, v), so
that negation must be suitably propagated inside Θ by inverting quantifiers and
logical connectives. □


Theorem 4. EXP_{TS^1}[Q] = DB-UW.

Proof. [Proof of EXP_{TS^1}[Q] ⊆ DB-UW] Take any query Q = (LP, G);
without loss of generality assume that G is a zero-arity atom g. Given 𝐃 =
EXP_{TS^1}(Q), we have to show that 𝐃 is in DB-UW, i.e., there exists an
existential second-order formula defining 𝐃 of the format (∃!S)Φ(S) where Φ(S) is
a first-order formula. By the definition of unique T-stable model semantics, a
database D on DS_LP is in 𝐃 if and only if the following conditions hold: (i) there
exists exactly one T-stable model for LP_D and (ii) g is in exactly one T-stable
model for LP_D. To complete the proof, it is sufficient to show that each of the
above two conditions is in UW. Observe that Condition (i) is not subsumed by
Condition (ii); in fact, the latter condition does not forbid other T-stable
models containing ¬g.

Condition (i) can be expressed by the second-order formula (∃!S)Γ(S) on
the database scheme DS_LP as follows. S has a relation symbol for each IDB
predicate symbol of LP, and selecting relations s for S defines a set M(s) of
ground literals {s(t) | s ∈ S and t is a tuple in the relation selected for
s}. We define Γ in such a way that, for each database D on DS_LP, Γ(s) is true if
and only if M(s) is a T-stable model of LP_D; therefore, the formula (∃!S)Γ(S) is
satisfied if there exists a unique T-stable model for LP_D. But testing T-stability
is in P by part 1 of Fact 1. So, as P ⊆ coNP ⊆ UW, Γ(s) can be expressed by a
second-order formula (∃!S_2)Θ(s_1, S_2), where s_1 = s.
Hence, Condition (i) is defined by the formula (∃!S_1, S_2)Θ(S_1, S_2).

It is now easy to see that also Condition (ii) is in UW. Indeed, take the above
formula (∃!S)Γ(S) with the following extended condition: for each database D
on DS_LP, Γ(s) is true if and only if both (i) M(s) is a T-stable model of LP_D
and (ii) g is in M(s). Let s_g be the relation symbol in S corresponding to g. Then
Γ(s) can now be expressed by a second-order formula (∃!S_2)(Θ(s_1, S_2) ∧ s_g),
where Θ, defined as above, tests T-stability of M(s) and s_g checks membership
of g in M(s).

[Proof of DB-UW ⊆ EXP_{TS^1}[Q]] Take any database collection 𝐃 on a
database scheme DS whose recognition is in UW. Then, by Lemma 3, 𝐃 can
be defined by a Skolem normal form second-order formula, say:

(∃!S)(∀x)(∃y)(Θ_1(S, x, y) ∨ ... ∨ Θ_k(S, x, y)).

It is now easy to prove that 𝐃 is the database set of a query under unique
T-stable model semantics. Indeed, consider the query Q = (LP, ¬g), where LP
is:

r1: s_j(W_j) ← ¬ŝ_j(W_j). (1 ≤ j ≤ m)

r2: ŝ_j(W_j) ← ¬s_j(W_j). (1 ≤ j ≤ m)

r3: q(X) ← Θ_i(X, Y). (1 ≤ i ≤ k)

r4: g ← ¬q(X).

r5: p ← g, ¬p.

Let D be a database on DS = DS_LP. We construct a T-stable model for LP_D as
follows. For each tuple w_j, the first two groups of rules make true either s_j(w_j)
or ŝ_j(w_j); using these rules, we perform a non-deterministic selection of relations
for S. For each x, rules 3 make q(x) true if there exists some y for which one of
the Θ_i is satisfied. By rule 4, g is false if and only if the selected relations for S are
witnesses for Φ(S) (i.e., for each x, q(x) is true). By rule 5, p is not undefined if
and only if g is made false; so the role of this rule is to invalidate any selection for
S that does not make g false. Therefore, the program LP_D admits a number of
T-stable models, one for every witness for Φ(S). Hence, if D ∈ 𝐃 then there is a
unique witness for Φ(S) and, therefore, a unique T-stable model of LP_D, say M;
since ¬g is true in M, D ∈ EXP_{TS^1}(Q) as well. On the other hand, if D ∉ 𝐃 then LP_D
admits either no T-stable model or multiple T-stable models, so D ∉ EXP_{TS^1}(Q).

It turns out that 𝐃 = EXP_{TS^1}(Q); therefore, DB-UW ⊆ EXP_{TS^1}[Q]. □

We point out that this is not the first time that a relationship between
DATALOG¬ and the class UW has been discovered: DATALOG¬ programs with unique
fixpoint are characterized in terms of UW in [16].

As coNP ⊆ UW ⊆ D^P, from Theorem 4 and Fact 2 we derive that, measured
in terms of expressive powers, unique semantics subsumes certain semantics and,
in turn, it is subsumed by definite semantics. The relationships among the various
versions of T-stable model semantics are depicted in Fig. 1.


Figure  1:  Relationships  among  T-stable  semantics 


Example 2. In Example 1, we have shown that, under the unique T-stable model
semantics, the query Q^K = (K, kernel) defines the problem of whether a graph
has exactly one kernel; this problem is a typical problem in UW. Since coNP ⊆
UW, according to Theorem 4 unique T-stable model semantics is also able to
express the coNP-complete problem of whether a graph has no kernel. We next
show how to modify the query to formulate this problem.

Let K'' be the program obtained from K by modifying the first two rules into:

v1(X) ← b, v(X), ¬v2(X).    v2(X) ← b, v(X), ¬v1(X).

and by adding the following three rules:

a ← ¬b.    b ← ¬a.    kernel ← a.

Under the unique T-stable model semantics, the query Q^{K''} = (K'', kernel)
expresses the problem of whether a graph has no kernel. In fact, the
interpretation M = {a, kernel} ∪ D is a T-stable model for K''_D for any database D.
Moreover, the program has an additional T-stable model for every kernel in the
graph G corresponding to D; for such models a is false and b is true. Therefore,
K''_D has a unique T-stable model (that is, M) if and only if G has no kernel. □


4  Subclasses  of  Queries  with  Unique  T-stable  Model 

So far we have analyzed various types of deterministic semantics for the
class Q of all DATALOG¬ queries. In this section we consider two interesting
subclasses of Q for which the unique T-stable model semantics is the natural
semantics:

- Q_1 = {Q = (LP, G) | ∀D on DS_LP, LP_D admits a unique T-stable model}

- Q_{0,1} = {Q = (LP, G) | ∀D on DS_LP, LP_D admits at most one T-stable model}

Obviously, Q_1 ⊆ Q_{0,1} ⊆ Q. Note that, while Q is a recursive query language,
the two subclasses are not recursively enumerable, as it is not in general decidable
whether a DATALOG¬ program has a unique T-stable model for every database.
Therefore, Q_1 and Q_{0,1} are not query languages in the sense of [13].

The two subclasses blur the differences among the various T-stable model
semantics.

Proposition 5.

1. For each Q ∈ Q_1, EXP_{TS^∃}(Q) = EXP_{TS^∀}(Q) = EXP_{TS^{∀!}}(Q) = EXP_{TS^1}(Q);

2. For each Q ∈ Q_{0,1}, EXP_{TS^∃}(Q) = EXP_{TS^{∀!}}(Q) = EXP_{TS^1}(Q).

Proof. Let Q = (LP, G) and D be a database on DS_LP. If Q is in Q_1
then LP_D has exactly one T-stable model, so all semantics coincide. Suppose
now that Q ∈ Q_{0,1}. If LP_D has no T-stable model then only certain semantics
behaves differently from the other semantics. □

Next  we  characterize  the  expressive  power  of  T-stable  model  semantics  for 
the  two  subclasses  of  queries.  To  this  end  we  need  to  consider  further  database 
classes,  first  introduced  in  [15]: 


^  The  database  D  is  seen  as  a  set  of  ground  atoms,  one  for  each  tuple. 
1. DB-$\mathcal{U}\Sigma^1_1$ denotes the subset of DB-$\mathcal{US}$ consisting of each database collection
$\mathbf{D}$ which is defined by a formula $(\exists! T)\phi(T)$ such that for each $D \notin \mathbf{D}$,
$(\exists T)\phi(T)$ is false, i.e., for each $D$, either the formula is satisfied by exactly
one witness (and, then, $D \in \mathbf{D}$) or it is not satisfied by any witness;

2. DB-$\mathcal{U}\Delta^1_1$ denotes the subset of DB-$\mathcal{U}\Sigma^1_1$ of all database collections $\mathbf{D}$ for
which the complementary database collection $\mathbf{D}'$ is in DB-$\mathcal{U}\Sigma^1_1$ as well.
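
The witness-counting condition in item 1 can be illustrated on small finite structures. The sketch below is only an illustration under stated assumptions: the database is a directed graph, the second-order variable T is a set of vertices, and the formula says that T is a kernel, taken here to mean an independent set such that every vertex outside T has an edge into T. This choice of formula and all names are hypothetical; the code counts witnesses by brute force.

```python
from itertools import combinations

def is_kernel(vertices, edges, k):
    """k is independent, and every vertex outside k has an edge into k."""
    independent = not any((u, v) in edges for u in k for v in k)
    absorbing = all(any((u, v) in edges for v in k) for u in vertices - k)
    return independent and absorbing

def witnesses(vertices, edges):
    """All vertex sets T satisfying the formula on this database."""
    ordered = sorted(vertices)
    return [set(c)
            for size in range(len(ordered) + 1)
            for c in combinations(ordered, size)
            if is_kernel(vertices, edges, set(c))]

path = ({1, 2}, {(1, 2)})                        # 1 -> 2
two_cycle = ({1, 2}, {(1, 2), (2, 1)})           # 1 <-> 2
three_cycle = ({1, 2, 3}, {(1, 2), (2, 3), (3, 1)})

for name, (vs, es) in [("path", path), ("two_cycle", two_cycle),
                       ("three_cycle", three_cycle)]:
    print(name, len(witnesses(vs, es)))
```

The path has exactly one witness and the three-cycle has none, so both respect the "exactly one or none" requirement. The two-cycle has two witnesses, so this particular formula violates the requirement on some database and therefore does not qualify as a defining formula in the sense of item 1.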

As discussed in [15], $\mathcal{U}\Sigma^1_1$ is related to the complexity class $\mathcal{UP}$ that has
been introduced by Valiant [22] and consists of all unambiguous computations.
Indeed $\mathcal{U}\Sigma^1_1$ captures $\mathcal{UP}$ if an order on the universe is available; in this case,
$\mathcal{U}\Delta^1_1$ captures $\mathcal{UP} \cap \text{co}\mathcal{UP}$.

Lemma 6. Given a second order formula $P = (\exists T)\phi(T)$ in $\mathcal{U}\Sigma^1_1$, there is a
Skolem normal form formula in $\mathcal{U}\Sigma^1_1$ which is equivalent to $P$.

Proof. We first bring $\phi(T)$ into prenex normal form and then repeatedly apply
the Skolemization introduced in the proof of Lemma 3. It is easy to see that every
relation symbol added by the Skolemization admits at most one witness. □

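Theorem 7.

1. $\text{DB-}\mathcal{U}\Delta^1_1 \subseteq \mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}_1] \subseteq \text{DB-}\mathcal{NP} \cap \text{DB-co}\mathcal{NP}$;
2. $\text{DB-}\mathcal{U}\Delta^1_1 \subseteq \mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}_{0,1}] \subseteq \text{DB-}\mathcal{NP} \cap \text{DB-}\mathcal{US}$.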

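Proof. (1) Since $\mathcal{EXP}^{\exists}_{\mathcal{TS}}[\mathbf{Q}] = \text{DB-}\mathcal{NP}$ by Fact 2 and $\mathbf{Q}_1 \subseteq \mathbf{Q}$, $\mathcal{EXP}^{\exists}_{\mathcal{TS}}[\mathbf{Q}_1] \subseteq \text{DB-}\mathcal{NP}$;
so, as $\mathcal{EXP}^{\exists}_{\mathcal{TS}}[\mathbf{Q}_1] = \mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}_1]$ by Proposition 5, $\mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}_1] \subseteq \text{DB-}\mathcal{NP}$.
By replacing $\exists$ with $\forall$ and $\mathcal{NP}$ with co$\mathcal{NP}$ and repeating the previous
argument we obtain $\mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}_1] \subseteq \text{DB-co}\mathcal{NP}$. Hence, $\mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}_1] \subseteq
\text{DB-}\mathcal{NP} \cap \text{DB-co}\mathcal{NP}$. Let us now prove the other relationship. Let $\mathbf{D}$ be a database
collection in DB-$\mathcal{U}\Delta^1_1$, say with database scheme $\mathcal{DS}$. Let $\mathbf{D}'$ be the complementary
database collection of $\mathbf{D}$. Then, by definition of $\mathcal{U}\Delta^1_1$, $\mathbf{D}$ and $\mathbf{D}'$ are defined
by two formulas in $\mathcal{U}\Sigma^1_1$, say $(\exists S)\phi(S)$ and $(\exists S')\phi'(S')$, respectively. By Lemma
6, we can assume that both formulas are in Skolem format, say: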

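$\phi(S) = (\forall \mathbf{x})(\exists \mathbf{y})(\theta_1(S, \mathbf{x}, \mathbf{y}) \vee \ldots \vee \theta_k(S, \mathbf{x}, \mathbf{y}))$,

$\phi'(S') = (\forall \mathbf{x}')(\exists \mathbf{y}')(\theta'_1(S', \mathbf{x}', \mathbf{y}') \vee \ldots \vee \theta'_{k'}(S', \mathbf{x}', \mathbf{y}'))$.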

Consider the program $\mathcal{LP}''$:

$r_1$: $a \leftarrow \neg b$.   $r_2$: $b \leftarrow \neg a$.

ra:  )  -  a,, (1  <  i  <  m) 

r4  :  Sj{Wj)  -  a,^Sj(Wj).  (I  <  j  <  m) 
$r_5$: $q(X) \leftarrow \theta_i(X, Y)$.   $(1 \le i \le k)$

rc-  9  ^ 

ry  :  ^  b,  (1  <  i  <  m') 

rs  :  Pj{W')  -  6,  (1  <  i  <  m') 

$r_9$: $q'(X') \leftarrow \theta'_i(X', Y')$.   $(1 \le i \le k')$

no:  g' 
$r_{11}$: $g'' \leftarrow \neg g$.   $r_{12}$: $g'' \leftarrow \neg g'$.   $r_{13}$: $p \leftarrow \neg g'', \neg p$.

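The program $\mathcal{LP}''$ consists of two subprograms, $\mathcal{LP}$ (rules 3-6) and $\mathcal{LP}'$ (rules
7-10), plus the first two rules, which enable one of the two subprograms, plus
the rules 11-13, which make $p$ undefined iff neither $g$ nor $g'$ is false. Observe
that, under the $\mathcal{TS}^{\exists}$ semantics, the queries $Q = \langle \mathcal{LP}, \neg g \rangle$ and $Q' = \langle \mathcal{LP}', \neg g' \rangle$
define $\mathbf{D}$ and $\mathbf{D}'$, respectively; moreover, for each $D$, if $D \in \mathbf{D}$ (resp., $\mathbf{D}'$) then
there exists exactly one T-stable model $M$ for $\mathcal{LP}_D$ (resp., $\mathcal{LP}'_D$) such that
$\neg g \in M$ (resp., $\neg g' \in M$). It is then easy to see that for each $D$, $\mathcal{LP}''_D$ has exactly
one T-stable model. Therefore, the corresponding query $Q''$ over $\mathcal{LP}''$ is in $\mathbf{Q}_1$ and
defines $\mathbf{D}$; so

$\text{DB-}\mathcal{U}\Delta^1_1 \subseteq \mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}_1]$.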

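(2) As $\mathbf{Q}_1 \subseteq \mathbf{Q}_{0,1}$, by Part (1), $\text{DB-}\mathcal{U}\Delta^1_1 \subseteq \mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}_{0,1}]$. Concerning the
second relationship, we have that $\mathcal{EXP}^{\exists}_{\mathcal{TS}}[\mathbf{Q}] = \text{DB-}\mathcal{NP}$ and $\mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}] = \text{DB-}\mathcal{US}$
by Fact 2 and Theorem 4. Therefore, as $\mathcal{EXP}^{\exists}_{\mathcal{TS}}[\mathbf{Q}_{0,1}] = \mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}_{0,1}]$ by
Proposition 5, we derive that $\mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}_{0,1}] \subseteq \text{DB-}\mathcal{NP} \cap \text{DB-}\mathcal{US}$. □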
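
The role played in the proof by the two selector rules and by rules 11-13 can be checked on a small ground instance. In the sketch below, the atoms `g1` and `g2` stand for g' and g'', the two subprograms are replaced by simply asserting `g` and `g1` as facts or leaving them out (a hypothetical stand-in, not the construction of the proof), and T-stable models are assumed to be total stable models in the Gelfond-Lifschitz sense. All function names are illustrative.

```python
from itertools import chain, combinations

def least_model(definite_rules):
    """Least model of a negation-free ground program, by naive fixpoint iteration."""
    model = set()
    changed = True
    while changed:
        changed = False
        for head, pos in definite_rules:
            if head not in model and all(atom in model for atom in pos):
                model.add(head)
                changed = True
    return model

def total_stable_models(rules):
    """Brute-force enumeration: a candidate is stable iff it equals the least model of its reduct."""
    atoms = sorted({h for h, _, _ in rules}
                   | {a for _, pos, neg in rules for a in chain(pos, neg)})
    found = []
    for size in range(len(atoms) + 1):
        for combo in combinations(atoms, size):
            candidate = set(combo)
            reduct = [(h, pos) for h, pos, neg in rules
                      if not any(a in candidate for a in neg)]
            if least_model(reduct) == candidate:
                found.append(candidate)
    return found

# Rules are (head, positive_body, negative_body).
CORE = [
    ("a", [], ["b"]),         # a   <- not b          (selector)
    ("b", [], ["a"]),         # b   <- not a          (selector)
    ("g2", [], ["g"]),        # g'' <- not g
    ("g2", [], ["g1"]),       # g'' <- not g'
    ("p", [], ["g2", "p"]),   # p   <- not g'', not p
]

def models_with(facts):
    """Total stable models of CORE extended with the given atoms as facts."""
    return total_stable_models(CORE + [(f, [], []) for f in facts])

print(models_with(["g", "g1"]))  # g and g' both true: p cannot be assigned a total value
print(models_with(["g"]))        # g' false: one model per selector branch
```

When both `g` and `g1` hold, `g2` is underivable and the last rule behaves like an odd negative loop on `p`, so no total stable model survives. As soon as one of the two is false, `g2` holds, `p` is false, and the selector alone determines the models.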

Note that classes of queries whose expressive power is bounded by $\mathcal{NP} \cap \text{co}\mathcal{NP}$
have been studied in [12, 11, 10] and that such classes are also characterized by
similar uniqueness conditions.

The above results are rather negative with respect to the possibility of singling
out a subclass of $\mathbf{Q}_1$ or $\mathbf{Q}_{0,1}$ which can be expressed by a query language more
powerful than stratified DATALOG$^{\neg}$. In fact, as the classes $\mathcal{NP} \cap \mathcal{US}$ and $\mathcal{NP} \cap
\text{co}\mathcal{NP}$, as well as any known subclass of them over $\mathcal{P}$, are not syntactic unless
something surprising is true (e.g., $\mathcal{NP} \subseteq \mathcal{US}$, $\mathcal{NP} = \text{co}\mathcal{NP}$ or $\mathcal{NP} \cap \text{co}\mathcal{NP} = \mathcal{P}$)
[13], it turns out that any query language in $\mathbf{Q}_1$ or $\mathbf{Q}_{0,1}$ cannot express
more than $\mathcal{P}$. But it is not known whether $\mathcal{P}$ is expressible by a query
language, nor whether there exists a language for total well-founded semantics
preserving the capability of expressing all fixpoint queries [1, 10]. Flum et al. have
recently shown in [8] that total well-founded semantics has the same expressive
power as 'partial' well-founded semantics. However, this result refers to database
equivalence in the sense that a 'partial' query on a database can be replaced by
the same query on a different database yielding a total model. Thus they have
not proved the existence of a language L with the power of fixpoint queries which
only generates queries whose well-founded models are total for every database.
Hence our conjecture that any practical language for DATALOG$^{\neg}$ queries with
unique T-stable model is not more expressive than stratified DATALOG$^{\neg}$:

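Conjecture 1 Given any subset $\mathbf{Q}'$ of $\mathbf{Q}_1$, if $\mathbf{Q}'$ is recursively enumerable
then $\mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}'] \subseteq \mathcal{EXP}^{\exists!}_{\mathcal{TS}}[\mathbf{Q}^{s}]$, where $\mathbf{Q}^{s}$ is the class of all DATALOG$^{\neg}$ queries
with stratified negation. □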